Statistics Theory
We study an empirical interpretation of the Pitman efficiency in testing for uniformity in the two-parameter family of beta distributions. We show that for contamination models the Pitman efficiency approximates relative efficiency…
We study the Kullback-Leibler (KL) divergence approximation theory of Gaussian mixture models (GMMs) by isolating an abstract mechanism behind several necessary-and-sufficient statements. The necessity direction is universal: if a density…
We establish the asymptotic validity of frequency-domain inference for stationary multivariate Hawkes processes under mild conditions, bridging the gap between theory and application. By developing upper bounds on the reduced cumulant…
Estimating the number of communities is a fundamental problem in network analysis under the stochastic block model (SBM). In this paper, we study penalized estimators for this task based on normalized likelihood criteria. We show that a…
Recent research in statistics has focused on dependence measures $\kappa(Y,X)$ taking values in $[0, 1]$, where 0 characterizes independence of $X$ and $Y$, and 1 perfect functional dependence of $Y$ on $X$. One class of such measures consists of the…
We study the problem of robust mean estimation with adversarially contaminated data under star-shaped constraints in a heavy-tailed noise setting, where only a finite second moment $\sigma^2$ is assumed. For a contamination level $…
Estimating causal effects of continuous treatments is a common problem in practice, for example, in studying average dose-response functions. Classical analyses typically assume that all confounders are fully observed, whereas in real-world…
We propose a new probabilistic characterization of the uniform distribution on the hypersphere in terms of the distribution of pairwise inner products, extending the ideas of \citep{cuesta2009projection,cuesta2007sharp} in a data-driven…
While measures of concordance (such as Spearman's rho, Kendall's tau, and Blomqvist's beta) are continuous with respect to weak convergence, Chatterjee's rank correlation xi, recently introduced in Azadkia and Chatterjee (2021), does not…
This article develops limit laws for network-sampling-based estimates of the subgraph counts and clustering coefficient of a large population network, and uses them for predictive inference. A model-based approach is used, where the population…
We propose a generalized debiased Lasso estimator based on a stability principle. When a single column of the design matrix is perturbed, the estimator admits a simple update formula that can be computed from the original solution. Under…
In applications of Bayesian procedures, once a class of priors has been chosen, it may be tempting to fix the prior's hyperparameters from the data, in an empirical Bayes (EB) fashion, usually by their maximum marginal likelihood estimates…
The block maxima (BM) approach in extreme value analysis fits the Generalized Extreme Value (GEV) distribution to a sample of block maxima. We consider all potential blocks from a sample, which leads to the All Block Maxima (ABM) estimator.…
We study harmonic map regression, a nonparametric estimator for manifold-valued responses that penalizes the empirical Fréchet risk by the Dirichlet energy. By connecting penalized regression to the theory of harmonic maps, the estimator…
We consider the problem of testing whether a multivariate distribution is axially symmetric about some unknown direction. Under a simple-spectrum assumption on the covariance matrix, any symmetry axis must coincide with an eigenvector of…
We study community detection in stochastic block models under pure node-level differential privacy, a stringent notion that protects the participation of an individual together with all of their incident edges. This setting is substantially…
This paper studies the degree to which a bivariate copula fails to be symmetric under coordinate permutation, a property known as non-exchangeability. Working within an axiomatic framework that quantifies this asymmetry through a family of…
Statistical inference based on optimal transport offers a different perspective from that of maximum likelihood, and has increasingly gained attention in recent years. In this paper, we study univariate nonparametric shape-constrained…
We evaluate the misclustering probability of a spectral clustering algorithm under a Gaussian mixture model with a general covariance structure. The algorithm partitions the data into two groups based on the sign of the first principal…
Let $N$ components be partitioned into two communities, denoted $\mathcal{P}_+$ and $\mathcal{P}_-$, possibly of different sizes. Assume that they are connected via a directed and weighted Erdős-Rényi (DWER) random graph with unknown…