Statistics Theory
A powerful robust mean estimator introduced by Catoni (2012) allows for mean estimation of heavy-tailed data while achieving the performance characteristics of the classical mean estimator for sub-Gaussian data. While Catoni's framework has…
In network epidemic models, controlling the spread of a disease often requires targeted interventions such as vaccinating high-risk individuals based on network structure. However, typical approaches assume complete knowledge of the…
Predictive inference requires balancing statistical accuracy against informational complexity, yet the choice of complexity measure is usually imposed rather than derived. We treat econometric objects as predictive rules, mappings from…
We study variable selection (also called support recovery) in high-dimensional sparse linear regression when one has external information on which variables are likely to be associated with the response. Consistent recovery is only possible…
Total Variation Denoising (TVD) is a fundamental denoising and smoothing method. In this article, we identify a new local minmax/maxmin formula producing two estimators which sandwich the univariate TVD estimator at every point.…
We study nonparametric regression over Besov spaces from noisy observations under sub-exponential noise, aiming to achieve minimax-optimal guarantees on the integrated squared error that hold with high probability and adapt to the unknown…
We propose a hybrid estimation procedure to estimate global fixed parameters and subject-specific random effects in a mixed fractional Black-Scholes model based on discrete-time observations. Specifically, we consider $N$ independent…
Approximate Leave-One-Out Cross-Validation (ALO-CV) is a method that has been proposed to estimate the generalization error of a regularized estimator in the high-dimensional regime where dimension and sample size are of the same order, the…
In recent years, differential privacy has been adopted by tech companies and governmental agencies as the standard for measuring privacy in algorithms. In this article, we study differential privacy in Bayesian posterior sampling settings.…
Geometric quantiles are location parameters which extend classical univariate quantiles to normed spaces (possibly infinite-dimensional) and which include the geometric median as a special case. The infinite-dimensional setting is highly…
We consider observations $(X,y)$ from single index models with unknown link function, Gaussian covariates and a regularized M-estimator $\hat\beta$ constructed from a convex loss function and regularizer. In the regime where sample size $n$…
A novel nonparametric test for the equality of the covariance matrices of two Gaussian stationary processes, possibly of different lengths, is proposed. The test translates to testing the equality of two spectral densities and is shown to…
Let a graph be observed through a finite random sampling mechanism. Spectral methods are routinely applied to such graphs, yet their outputs are treated as deterministic objects. This paper develops finite-sample inference for spectral…
Recent advances in artificial intelligence have enabled the generation of large-scale, low-cost predictions with increasingly high fidelity. As a result, the primary challenge in statistical inference has shifted from data scarcity to data…
We prove asymptotic equivalence of nonparametric additive regression and an appropriate Gaussian white noise experiment in which a multidimensional shifted Wiener process is observed, whose dimension equals the number of additive…
We prove new, general versions of the Bernstein-von Mises theorem for both well-specified and misspecified models when the log-likelihood is concave in the parameter and the prior distribution is log-concave. Unlike classical versions of…
When analysing extreme values, two alternative statistical approaches have historically been held in contention: the block maxima method (or annual maxima method, spurred by hydrological applications) and the peaks-over-threshold method. Clamoured…
Causal inference is a central goal across many scientific disciplines. Over the past several decades, three major frameworks have emerged to formalize causal questions and guide their analysis: the potential outcomes framework, structural…
The limit distribution of the nonparametric maximum likelihood estimator for interval censored data with more than one observation time per unobservable observation is still unknown in general. For the so-called separated case, where one…
Online learning is an inferential paradigm in which parameters are updated incrementally from sequentially available data, in contrast to batch learning, where the entire dataset is processed at once. In this paper, we assume that…