Statistics Theory
In this paper, we elucidate the geometry of Stein's method of moments (SMoM). SMoM is a parameter estimation method based on the Stein operator, and yields a wide class of estimators that do not depend on the normalizing constant. We…
This paper studies the central limit theorems (CLTs) for linear spectral statistics (LSSs) of general sample covariance matrices, when the test functions belong to $C^3$, the class of functions with continuous third order derivatives. We…
This study presents a new procedure for necessary tests of multivariate normality based on the uniform distribution on the Stiefel manifold. We demonstrate that the test statistic, which is formed by the product of the scaled residual…
Asymptotic inference using functional principal component regression (FPCR) has long been considered difficult, largely because, upon any scalar scaling, the FPCR estimator fails to satisfy a central limit theorem, leading to the prevailing…
We investigate the asymptotic behavior of parametric Bayes estimators under a broad class of loss functions that extend beyond the classical translation-invariant setting. To this end, we develop a unified theoretical framework for loss…
Contemporary focus on selective inference has renewed interest in the theory of selection models. In this paper, we analyze the asymptotic properties of selection models built on independent and identically distributed observations. We show…
This note extends conformal e-prediction to cover the case where there is observed confounding between the random object $X$ and its label $Y$. We consider both the case where the observed data is IID and a case where some dependence…
In this paper, we study the problem of detecting multiple hidden submatrices in a large Gaussian random matrix when the planted signal is inhomogeneous across entries. Under the null hypothesis, the observed matrix has independent and…
Faithfulness is a common assumption in causal inference, often motivated by the fact that the faithful parameters of linear Gaussian and discrete Bayesian networks are typical, and the folklore belief that this should also hold for other…
Standard thresholding techniques for correlation matrices often destroy positive semidefiniteness. We investigate the construction of positive definite functions that vanish on specific sets $K \subseteq [-1,1)$, ensuring that the…
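The first claim in this abstract, that entrywise thresholding can destroy positive semidefiniteness, is easy to check numerically. The sketch below is only an illustration and is not taken from the paper: the 3x3 matrix, the threshold value 0.5, and the helper name `hard_threshold` are all assumptions chosen for the example.

```python
import numpy as np

def hard_threshold(corr, tau):
    """Zero out off-diagonal entries with absolute value below tau (hypothetical helper)."""
    out = np.where(np.abs(corr) >= tau, corr, 0.0)
    np.fill_diagonal(out, 1.0)  # a correlation matrix keeps a unit diagonal
    return out

# An illustrative positive definite correlation matrix (values are assumptions).
R = np.array([
    [1.00, 0.75, 0.40],
    [0.75, 1.00, 0.75],
    [0.40, 0.75, 1.00],
])

# Hard thresholding at 0.5 removes the 0.40 entries and keeps the 0.75 entries.
T = hard_threshold(R, tau=0.5)

# eigvalsh is appropriate because both matrices are symmetric.
min_eig_R = np.linalg.eigvalsh(R).min()
min_eig_T = np.linalg.eigvalsh(T).min()

print(f"smallest eigenvalue before thresholding: {min_eig_R:.4f}")
print(f"smallest eigenvalue after thresholding:  {min_eig_T:.4f}")
```

In this example the thresholded matrix is tridiagonal with off-diagonal entries 0.75, so its smallest eigenvalue is $1 - 0.75\sqrt{2} < 0$, and the result is no longer a valid correlation matrix even though the input was positive definite.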
Deep learning models are being used for the analysis of parametric statistical models based on simulation-only frameworks. Bayesian models using normalizing flows simulate data from a prior distribution and are composed of two deep neural…
We establish a strong Gaussian approximation for high-dimensional non-degenerate U-statistics with diverging dimension. Under mild assumptions, we construct, on a sufficiently rich probability space, a Gaussian process that uniformly…
Part I of this series (arXiv:2602.09029) develops a sharp Gaussian (LAN/GDP) limit theory for neighboring shuffle experiments when the local randomizer is fixed and has full support bounded away from zero. The present paper characterizes…
We study asymmetric rank-one spiked tensor models in the high-dimensional regime, where the noise entries are independent and identically distributed with zero mean, unit variance, and finite fourth moment. This extends the classical…
The empirical Orlicz norm based on a random sample is defined as a natural estimator of the Orlicz norm of a univariate probability distribution. A law of large numbers is derived under minimal assumptions. The latter extends readily to a…
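For readers unfamiliar with the object, a plug-in version of the Orlicz norm can be computed by bisection. The sketch below assumes the usual Luxemburg form, $\inf\{c > 0 : \tfrac{1}{n}\sum_i \psi(|X_i|/c) \le 1\}$, for an increasing Young function $\psi$. The truncated abstract does not state the paper's exact definition, so this form, the function name `empirical_orlicz_norm`, and the tolerance are assumptions made for illustration.

```python
import numpy as np

def empirical_orlicz_norm(x, psi, tol=1e-10):
    """Plug-in Luxemburg norm: inf{c > 0 : mean(psi(|x_i| / c)) <= 1}.

    Hypothetical helper; assumes psi is increasing, psi(0) = 0, and unbounded.
    """
    a = np.abs(np.asarray(x, dtype=float))
    if a.max() == 0.0:
        return 0.0

    def mean_psi(c):
        return np.mean(psi(a / c))

    # Bracket the norm: mean_psi is decreasing in c.
    hi = a.max()
    while mean_psi(hi) > 1.0:
        hi *= 2.0
    lo = hi / 2.0
    while mean_psi(lo) <= 1.0:
        lo /= 2.0

    # Bisection keeps mean_psi(hi) <= 1 < mean_psi(lo).
    while hi - lo > tol * hi:
        mid = 0.5 * (lo + hi)
        if mean_psi(mid) <= 1.0:
            hi = mid
        else:
            lo = mid
    return hi

sample = np.array([3.0, 4.0])

# With psi(t) = t^2 the norm reduces to the root mean square of the sample.
l2_case = empirical_orlicz_norm(sample, lambda t: t ** 2)

# The sub-Gaussian choice psi_2(t) = exp(t^2) - 1.
psi2_case = empirical_orlicz_norm(sample, lambda t: np.expm1(t ** 2))

print(l2_case, psi2_case)
```

The choice $\psi(t) = t^2$ serves as a sanity check, since the constraint then reads $\tfrac{1}{n}\sum_i X_i^2 \le c^2$ and the infimum is the root mean square of the sample.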
For some discretely observed path of oscillating Brownian motion with level of self-organized criticality $\rho_0$, we prove in the infill asymptotics that the MLE is $n$-consistent, where $n$ denotes the sample size, and derive its limit…
We advance the theory of parametric bootstrap in constructing highly efficient empirical best (EB) prediction intervals of small area means. The coverage error of such a prediction interval is of the order $O(m^{-3/2})$, where $m$ is the…
Gaussian Processes (GPs) are widely used to model dependencies in spatial statistics and machine learning. However, exact inference is computationally intractable for GP regression, with a time complexity of $O(n^3)$. The Vecchia…
The matrix normal model, i.e., the family of Gaussian matrix-variate distributions whose covariance matrices are the Kronecker product of two lower dimensional factors, is frequently used to model matrix-variate data. The tensor normal…
Exact regions between rank correlations describe the set of all pairs of values that two dependence measures can attain simultaneously on the same copula and thus yield sharp inequalities between them. In this paper, we determine the exact…