Statistics Theory
Prediction sets offer a binary inclusion/exclusion for each element at the same fixed confidence level. We generalize to fuzzy prediction sets, which exclude elements at their own data-driven confidence level. Our key insight is that a…
We prove the asymptotic mixed normality of the least absolute deviation (LAD) estimator for a locally $\alpha$-stable stochastic differential equation (SDE) observed at high frequency, where $\alpha\in(0,2)$. We investigate both ergodic and…
This paper studies high-dimensional M-estimation in the proportional asymptotic regime ($p/n \to \gamma > 0$) when the noise distribution has infinite variance. For noise with regularly varying tails of index $\alpha \in (1,2)$, we establish that…
In many practical and numerical inverse problems, the exact data log-likelihood is not fully accessible, motivating the use of surrogate models. We study heteroscedastic nonparametric nonlinear regression problems with Gaussian errors and…
Recently, several spectra have emerged, designed to encapsulate the distributional characteristics of non-Gaussian stationary processes. This article introduces parametric families of generalized spectra based on the characteristic…
The empirical Bayes $g$-modeling approach via the nonparametric maximum likelihood estimator (NPMLE) is widely used for large-scale estimation and inference in the normal means problem, yet theoretical guarantees for uncertainty…
Tuning parameters are parameters involved in an estimating procedure for the purpose of reducing the risk of some other estimator. Examples include the degree of penalization in penalized regression and likelihood problems, as well as the…
We investigate the structural foundations of statistical efficiency under $\alpha$-local differential privacy, with a focus on maximizing Fisher information. Building on the role of continuous staircase mechanisms, we identify a fundamental…
Motivated by applications in statistics and machine learning, we consider a problem of unmixing convex combinations of nonparametric densities. Suppose we observe $n$ groups of samples, where the $i$th group consists of $N_i$ independent…
Statistical estimation often involves tradeoffs between expensive, high-quality measurements and a variety of lower-quality proxies. We introduce Multiple-Prediction-Powered Inference (MultiPPI): a general framework for constructing…
We study the recovery of geometric structure from data generated by convolving the uniform measure on a smooth compact submanifold $M\subset\mathbb{R}^D$ with ambient Gaussian noise. Our main result is that several fundamental Riemannian…
This paper focuses on random projection operators when the subspace of projection is estimated. We derive non-asymptotic upper bounds on the error between the projection onto the estimated subspace and the projection onto the underlying…
In this paper, we study the estimation of drift and diffusion coefficients in a two-dimensional system of $N$ interacting particles modeled by a degenerate stochastic differential equation. We consider both complete and partial observation…
In this paper, we obtain new results on the weak and strong consistency of the maximum and integrated conditional likelihood estimators for the community detection problem in the Stochastic Block Model with $k$ communities and unknown…
The multi-index model with sparse dimension reduction matrix is a popular approach to circumvent the curse of dimensionality in a high-dimensional regression setting. Building on the single-index analysis by Alquier, P. & Biau, G. (Journal…
We study the problem of learning multivariate dependencies in nonparametric and high-dimensional settings. This includes but is not limited to graphical models. Our approach effectively combines several features that are missing from…
In multivariate extreme value statistics, the first step in understanding the dependence structure of extremes is identifying the directions in which they occur. The novelty of this paper is the analysis of high-dimensional extreme value…
We investigate the asymptotic properties of Bayesian bivariate causal discovery for Gaussian Linear Structural Equation Models (SEMs) with heteroscedastic noise. We demonstrate that with purely observational data, the posterior distribution…
A crucial assumption to reduce computational complexity in spatial-temporal data analysis is separability, which factors the covariance structure into a purely spatial and a purely temporal component. In this paper, we develop statistical…
Multiplex networks are a powerful framework for representing systems with multiple types of interactions among a common set of entities. Understanding their structure requires statistical tools capturing higher-order cross-layer…