Statistics Theory
Differential privacy provides a formal framework for releasing statistical estimators that limit how much any single observation can influence the output, by injecting calibrated random noise. We study differentially private estimation in…
This article develops a moderate-deviation limit theory for autoregressive models with jointly persistent mean and volatility dynamics. The autoregressive coefficient is allowed to drift toward unity slower than the classical 1/n rate,…
We develop Gaussian approximations for high-dimensional vectors formed by second-order $U$- and $V$-statistics whose kernels depend on sample size under independent but not identically distributed (i.n.i.d.) sampling. Our results hold…
This paper introduces a robust and computationally efficient estimation framework for high-dimensional volatility models in the BEKK-ARCH class. The proposed approach employs data truncation to ensure robustness against heavy-tailed…
A new method of estimating population linear spectral statistics from high-dimensional data is introduced. When the dimension $d$ grows with the sample size $n$ such that $\frac{d}{n} \to c>0$, the proposed method is the first with proven…
We investigate the concept of an asymptotic e-process, which is a doubly-indexed stochastic process $(E_{m,n})_{m,n\in\mathbb{N}}$ that possesses, asymptotically for an approximation index $m\to\infty$, the properties of an e-process along…
An integer-valued moving average (INMA) model for count random fields is proposed and investigated. Closed-form expressions are derived for both its marginal distribution and spatial dependence structure, for arbitrary model order and also…
When an instrumental variable is available, the standard practice in causal inference begins by targeting an effect of interest and proceeds by formulating assumptions enabling its identification. We turn this around by adhering to…
This paper studies closed-form expressions for multiple association measures of copulas commonly used for approximation purposes, including Bernstein, shuffle--of--min, checkerboard and check--min copulas. In particular, closed-form…
We establish thresholds for the feasibility of random multi-graph alignment in two models. In the Gaussian model, we demonstrate an "all-or-nothing" phenomenon: above a critical threshold, exact alignment is achievable with high…
Chatterjee's rank correlation is a directed measure of association designed to detect whether one variable can be predicted as a function of another. While the original coefficient is naturally defined for real-valued data, circular data…
We introduce a new measure of robustness for statistical estimators, which we call \emph{empirical sensitivity}. An estimator $\hat \theta$ has bounded empirical sensitivity if, with high probability over a dataset $X = (X_1, \dots, X_n)…
In this paper we study Bayesian supervised learning models proposed by L\^e in \cite{Le2025}. We show the existence of Bayesian inversions on universal Bayesian supervised learning models $(\mathcal{P}(\mathcal{Y})^{\mathcal{X}}, \mu,…
Using i.i.d. data to estimate a high-dimensional distribution in Wasserstein distance is a fundamental instance of the curse of dimensionality. We explore how structural knowledge about the data-generating process which gives rise to the…
We investigate the optimization landscape of maximum likelihood estimation (MLE) for the Cavender-Farris-Neyman (CFN) model, a two-state latent tree model fundamental to statistical phylogenetics and the ferromagnetic Ising model. Although…
This manuscript studies a general approach to constructing confidence sets for the solution of a stochastic optimization problem, with empirical risk minimization as a special case. Statistical inference for stochastic optimization poses significant…
We introduce the data-driven extreme value distribution (DDEVD) estimator, a kernel-based method for estimating extreme value distributions from data. We derive its mean integrated squared error (MISE) in detail, use it to compute the…
Optimal transport provides an inherently geometric and highly structured framework for studying spaces of probability measures, supplying a rich theoretical toolkit for contemporary statistics, machine learning, and generative modelling. In…
We study the problem of testing $H_0: \xi^\top\beta=t_0$ in high-dimensional sparse linear regression with Gaussian random design and unknown design covariance. The loading vector $\xi$ is arbitrary, and the exact sparsity level $k$ is…
Estimation under model misspecification arises in many signal processing problems, where the assumed observation model deviates from the true data-generating mechanism due to errors or simplifications. The misspecified Cram\'er-Rao bound…