Statistics Theory
A new approach to producing multidisciplinary lists of highly cited researchers is described and used to compile one such list. This approach is essentially related to the recently discovered law of…
We study model selection by the Bayesian information criterion (BIC) in fixed-dimensional exploratory factor analysis over a fixed finite family of compact covariance classes. Our main result shows that the BIC is strongly consistent for…
When multiple investigators analyze a common dataset, the data reuse induces dependence across testing procedures, affecting the distribution of errors. Existing techniques for managing dependent tests require either cross-study coordination…
The Optimal Transport (OT) problem with squared Euclidean cost consists in finding a coupling between two input measures that maximizes correlation. Consequently, the optimal coupling is often singular with respect to the Lebesgue measure.…
Kernel methods provide a flexible and powerful framework for nonparametric statistical testing by embedding probability distributions into a reproducing kernel Hilbert space (RKHS). In this work, we study the kernel two-sample testing…
We construct optimal low-rank approximations for the Gaussian posterior distribution in linear Gaussian inverse problems with possibly infinite-dimensional separable Hilbert parameter spaces and finite-dimensional data spaces. We first…
Consider a binary mixture model of the form $F_\theta = (1-\theta)F_0 + \theta F_1$, where $F_0$ is standard Gaussian and $F_1$ is a completely specified heavy-tailed distribution with the same support. For a sample of $n$ independent and…
The randomized unbiased estimators of Rhee and Glynn (Operations Research 63(5), 1026-1043, 2015) can be highly efficient at approximating expectations of path functionals associated with stochastic differential equations (SDEs). However,…
Under general assumptions on the target distribution $p^\star$, we establish a sharp Lipschitz regularity theory for flow-matching vector fields and diffusion-model scores, with optimal dependence on time and dimension. As applications, we…
We consider a statistical model of an $n$-mode quantum Gaussian state that is both shift invariant and gauge invariant. Such models can be considered analogs of classical Gaussian stationary time series, parametrized by their spectral…
Subspace clustering becomes inherently difficult near intersections, where points from different subspaces are barely separated. Most existing theoretical results address this issue by imposing separation or sampling assumptions that limit…
We introduce a new global sensitivity measure, the global activity scores. The measure is based on finite differences of the underlying function, in contrast to several sensitivity measures in the literature that are based on derivatives of…
Modeling observations as random distributions embedded within Wasserstein spaces is becoming increasingly popular across scientific fields, as it captures the variability and geometric structure of the data more effectively. However, the…
A birth-death-move process with mutations is a Markov model for a system of interacting marked particles that move over time and undergo births and deaths. In addition, the mark of each particle may change, which constitutes a mutation.…
Motivated by the problem of matching two correlated random geometric graphs, we study the problem of matching two Gaussian geometric models correlated through a latent node permutation. Specifically, given an unknown permutation $\pi^*$ on…
We study bootstrap inference for the $k$th largest coordinate of a normalized sum of independent high-dimensional random vectors. Existing second-order theory for maxima does not directly extend to order statistics, because the event…
Bayesian neural networks (BNNs) offer a natural probabilistic formulation for inference in deep learning models. Despite their popularity, their optimality has received limited attention through the lens of statistical decision theory. In…
In this paper, we study estimation of parameters in a two-parameter Potts model with $q$ colors and coupling matrix $A_N$. We characterize concrete sufficient conditions for existence of the pseudo-likelihood estimator of the Potts model,…
This paper studies the problem of recovering a hidden vertex correspondence between two correlated graphs when both edge weights and node features are observed. While most existing work on graph alignment relies primarily on edge…
We develop a rigorous theoretical framework for principal manifold estimation that recovers a latent low-dimensional manifold from a point cloud observed in a high-dimensional ambient space. Our framework accommodates manifolds with…