Statistics Theory
Off-policy evaluation (OPE) constructs confidence intervals for the value of a target policy using data generated under a different behavior policy. Most existing inference methods focus on fixed target policies and may fail when the target…
Factor analysis models explain dependence among observed variables by a smaller number of unobserved factors. A main challenge in confirmatory factor analysis is determining whether the factor loading matrix is identifiable from the…
We study the problem of finding the index of the minimum value of a vector from noisy observations. This problem is relevant in population/policy comparison, discrete maximum likelihood, and model selection. We develop an asymptotically…
The present work aims to prove mathematically that a biologically inspired neural network can learn a classification task using local transformations only. To this end, we propose a spiking neural network named CHANI…
Most of the literature on differential privacy considers the item-level case where each user has a single observation, but a growing field of interest is that of user-level privacy where each of the $n$ users holds $T$ observations and…
Missing values pose a persistent challenge in modern data science. Consequently, there is an ever-growing number of publications introducing new imputation methods in various fields. The present paper attempts to take a step back and…
The ridgeless minimum $\ell_2$-norm interpolator in overparametrized linear regression has attracted considerable attention in recent years in both the machine learning and statistics communities. While it seems to defy conventional wisdom that…
Modern data analysis depends increasingly on estimating models via flexible high-dimensional or nonparametric machine learning methods, where the identification of structural parameters is often challenging and untestable. In linear…
This article is an exposition on some recent theoretical advances in learning latent structured models, with a primary focus on the fundamental roles that optimal transport distances play in the statistical theory. We aim at what may be the…
This work presents the first systematic development of Stein's method for matrix distributions. We establish the basic essential ingredients of Stein's method for matrix normal approximation: we derive a generator-based Stein identity from…
We derive the unique e-values with optimal (relative) growth rate in the worst case for testing the mean of a bounded random variable, thereby contributing the first application beyond the assumption of mutually absolutely continuous…
Penalized smoothing is a standard tool in regression analysis. Classical approaches often rely on basis or kernel expansions, which constrain the estimator to a fixed span and impose smoothness assumptions that may be restrictive for…
In this expository article, we summarize what is known about maximum likelihood thresholds of Gaussian models, paying special attention to connections with rigidity theory.
We present a general non-parametric statistical inference theory for integrals of quantiles without assuming any specific sampling design or dependence structure. Technical considerations are accompanied by examples and discussions,…
This paper introduces a framework for approximate message passing (AMP) in dynamic settings where the data at each iteration is passed through a linear operator. This framework is motivated in part by applications in large-scale,…
This paper studies extremal quantiles under two-way clustered dependence. We show that the limiting distribution of unconditional intermediate-order tail quantiles is Gaussian. This result is notable because two-way clustering typically…
Differential Privacy (DP) provides a rigorous framework for releasing statistics while protecting individual information present in a dataset. Although substantial progress has been made on differentially private linear regression, existing…
We consider general parameter-to-solution maps $\theta \mapsto \mathcal G(\theta)$ of non-linear partial differential equations and describe an approach based on a Banach space version of the implicit function theorem to verify the gradient…
Manifold fitting aims to reconstruct a low-dimensional manifold from high-dimensional data, within the framework established by Fefferman et al. \cite{fefferman2020reconstruction,fefferman2021reconstruction}. This paper studies the recovery…
Sequential change detection is a fundamental problem in statistics and signal processing, with the CUSUM procedure widely used to achieve minimax detection delay under a prescribed false-alarm rate when pre- and post-change distributions…