Statistics Theory
In this paper, we derive an analytic formula for the ROC curves of LDA classifiers. We establish elementary properties of these curves (monotonicity and concavity), provide a formula for the area under the curve (AUC), and compute the Youden…
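As a point of reference for the quantities named in this abstract, here is a minimal sketch of the textbook equal-variance binormal model, in which the discriminant score is N(0, 1) in one class and N(delta, 1) in the other. This is an assumption made for illustration and is not necessarily the formula derived in the paper; the function names and the separation parameter `delta` are hypothetical.

```python
from statistics import NormalDist

_STD = NormalDist()  # standard normal distribution


def binormal_roc(fpr, delta):
    """True positive rate at a given false positive rate.

    Assumes scores are N(0, 1) for negatives and N(delta, 1) for positives.
    """
    if fpr <= 0.0:
        return 0.0
    if fpr >= 1.0:
        return 1.0
    return _STD.cdf(delta + _STD.inv_cdf(fpr))


def binormal_auc(delta):
    """Area under the ROC curve: P(positive score > negative score)."""
    return _STD.cdf(delta / 2 ** 0.5)


def binormal_youden(delta):
    """Youden index: maximum of TPR - FPR, attained at threshold delta / 2."""
    return 2 * _STD.cdf(delta / 2) - 1
```

With `delta = 0` the two classes are indistinguishable, so the AUC is 0.5 and the Youden index is 0; both grow toward 1 as `delta` increases.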
Recently, a Wasserstein analogue of the Cramer--Rao inequality has been developed using the Wasserstein information matrix (Otto metric). This inequality provides a lower bound on the Wasserstein variance of an estimator, which quantifies…
We discuss optimal prediction for families of probability distributions with a locally compact topological group structure. Right-invariant priors were previously shown to yield a posterior predictive distribution minimizing the worst-case…
The present manuscript is concerned with componentwise estimation of a positive power of the order-restricted standard deviations of two normal populations with certain restrictions on the means. We propose several improved estimators under…
Although the existing causal inference literature focuses on the forward-looking perspective by estimating effects of causes, the backward-looking perspective can provide insights into causes of effects. In backward-looking causal…
A dataset with two labels is linearly separable if it can be split into its two classes with a hyperplane. This inflicts a curse on some statistical tools (such as logistic regression) but forms a blessing for others (e.g. support vector…
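The definition of linear separability above can be made concrete with a small check. The sketch below uses the classical perceptron update, which is guaranteed to stop on separable data but can only report "nothing found" otherwise; it is an illustration of the definition, not a method taken from the paper, and the function name and the `max_epochs` cap are hypothetical choices.

```python
def find_separating_hyperplane(points, labels, max_epochs=1000):
    """Search for (w, b) with y * (w . x + b) > 0 for every labelled point.

    Labels must be +1 or -1. Returns (w, b) if a separating hyperplane is
    found within max_epochs passes, otherwise None (inconclusive).
    """
    dim = len(points[0])
    w = [0.0] * dim
    b = 0.0
    for _ in range(max_epochs):
        mistakes = 0
        for x, y in zip(points, labels):
            margin = y * (sum(wi * xi for wi, xi in zip(w, x)) + b)
            if margin <= 0:
                # Perceptron update: move the hyperplane toward the mistake.
                w = [wi + y * xi for wi, xi in zip(w, x)]
                b += y
                mistakes += 1
        if mistakes == 0:
            return w, b
    return None
```

A returned pair certifies separability; `None` only means that no hyperplane was found within the epoch budget, as happens for the XOR configuration.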
We use Markov categories to generalize the basic theory of Markov chains and hidden Markov models to an abstract setting. This comprises characterizations of hidden Markov models in terms of conditional independences and algorithms for…
We study the problem of designing consistent sequential two-sample tests in a nonparametric setting. Guided by the principle of testing by betting, we reframe this task into that of selecting a sequence of payoff functions that maximize the…
Large-scale registries have collected vast amounts of data, enabling investigators to conduct observational studies efficiently. Common practice is for investigators to use all data meeting the inclusion criteria of their…
We study general M-estimators of location on Riemannian manifolds, extending classical notions such as the Fréchet mean by replacing the squared loss with a broad class of loss functions. Under minimal regularity conditions on the loss…
In a multivariate setting, we derive near-optimal minimax rates of convergence for estimating partial derivatives of the mean function for functional data observed under a fixed synchronous design over H\"older…
An asymptotic theory is established for linear functionals of the predictive function given by kernel ridge regression, when the reproducing kernel Hilbert space is equivalent to a Sobolev space. The theory covers a wide variety of linear…
We consider density estimation under measurement error with the Smoothness-Penalized Deconvolution (SPeD) estimator. The estimator has a tuning parameter regulating the smoothness of the estimate, and proper choice of this parameter is…
Conformal Prediction (CP) has recently received a tremendous amount of interest, leading to a wide range of new theoretical and methodological results for predictive inference with formal theoretical guarantees. However, the vast majority…
This work is concerned with the limiting spectral distribution of rank-based dependency measures in high dimensions. We provide distribution-free results for multivariate empirical versions of Kendall's $\tau$ and Spearman's $\rho$ in a…
We consider a time-ordered sequence of networks stemming from stochastic block models where nodes gradually change their memberships over time, and no network at any single time point contains sufficient signal strength to recover its…
Gaussian process (GP) regression is a Bayesian nonparametric method for regression and interpolation, offering a principled way of quantifying the uncertainties of predicted function values. For the quantified uncertainties to be…
Let $\{X_{1},\ldots,X_{N_1}\}$ and $\{Y_{1},\ldots,Y_{N_2}\}$ be two sequences of interdependent heterogeneous samples, where for $i=1,\ldots,N_{1},$ $X_{i}\sim \text{Kw-G}(x, \alpha_{i}, \gamma_{i};G)$ and for $i=1,\ldots,N_{2},$…
This article explores combinations of weighted bootstraps, like the Bayesian bootstrap, with the bootstrap $t$ method for setting approximate confidence intervals for the mean of a random variable in small samples. For this problem the…
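For readers unfamiliar with the Bayesian bootstrap mentioned here, the following is a minimal sketch of the plain version for a mean: each replicate reweights the observations with Dirichlet(1, ..., 1) weights, generated by normalizing independent Exp(1) draws. It does not implement the bootstrap $t$ combination studied in the article, and the function names, replicate count, and percentile rule are hypothetical choices for illustration.

```python
import random


def bayesian_bootstrap_means(data, n_rep=2000, seed=0):
    """Draw n_rep weighted means using Dirichlet(1, ..., 1) weights."""
    rng = random.Random(seed)
    means = []
    for _ in range(n_rep):
        # Normalized Exp(1) variables are Dirichlet(1, ..., 1) distributed.
        g = [rng.gammavariate(1.0, 1.0) for _ in data]
        total = sum(g)
        means.append(sum(gi * xi for gi, xi in zip(g, data)) / total)
    return means


def percentile_interval(draws, level=0.95):
    """Simple percentile interval from a list of replicate values."""
    s = sorted(draws)
    lo_idx = int((1 - level) / 2 * (len(s) - 1))
    hi_idx = int((1 + level) / 2 * (len(s) - 1))
    return s[lo_idx], s[hi_idx]
```

Calling `percentile_interval(bayesian_bootstrap_means(sample))` on a small sample gives an approximate interval for the mean. Because the weights are non-negative and sum to one, every replicate lies between the smallest and largest observation.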
Denoising diffusion probabilistic models (DDPMs) represent a recent advance in generative modelling that has delivered state-of-the-art results across many domains of applications. Despite their success, a rigorous theoretical understanding…