Statistics Theory - Scifaro

On the Variance, Admissibility, and Stability of Empirical Risk Minimization

It is well known that Empirical Risk Minimization (ERM) may attain minimax suboptimal rates in terms of the mean squared error (Birgé and Massart, 1993). In this paper, we prove that, under relatively mild assumptions, the suboptimality…

Statistics Theory · Mathematics 2025-11-04 Gil Kur, Eli Putterman, Alexander Rakhlin

A smooth transition from Wishart to GOE

It is well known that an $n \times n$ Wishart matrix with $d$ degrees of freedom is close to the appropriately centered and scaled Gaussian Orthogonal Ensemble (GOE) if $d$ is large enough. Recent work of Bubeck, Ding, Eldan, and Racz, and…

Statistics Theory · Mathematics 2025-11-04 Miklos Z. Racz, Jacob Richey

Advanced Distribution Theory for Significance in Scale Space

Smoothing methods find signals in noisy data. A challenge for statistical inference is the choice of smoothing parameter. SiZer addressed this challenge in one dimension by detecting significant slopes across multiple scales, but was not a…

Statistics Theory · Mathematics 2025-11-03 Rui Liu, Jan Hannig, J. S. Marron

Testing and estimation in orthosymmetric Gaussian sequence model

We study the Gaussian sequence model, i.e. $X \sim N(\mathbf{\theta}, I_\infty)$, where $\mathbf{\theta} \in \Gamma \subset \ell_2$ is assumed to be convex and compact. We show that goodness-of-fit testing sample complexity is lower bounded…

Statistics Theory · Mathematics 2025-11-03 Zeyu Jia, Yury Polyanskiy

Adversarially robust clustering with optimality guarantees

We consider the problem of clustering data points coming from sub-Gaussian mixtures. Existing methods that provably achieve the optimal mislabeling error, such as the Lloyd algorithm, are usually vulnerable to outliers. In contrast,…

Statistics Theory · Mathematics 2025-11-03 Soham Jana, Kun Yang, Sanjeev Kulkarni

Linear regression with known noise distribution up to a scale: The reward of not using the OLSE

While the ordinary least squares estimator (OLSE) is still the most used estimator in linear regression models, other estimators can be more efficient when the error distribution is not Gaussian. In this paper, our goal is to evaluate this…

Statistics Theory · Mathematics 2025-10-31 Fadoua Balabdaoui, Justine Leclerc

A theoretical comparison of weight constraints in forecast combination and model averaging

Forecast combination and model averaging have become popular tools in forecasting and prediction, both of which combine a set of candidate estimates with certain weights and are often shown to outperform single estimates. A data-driven…

Statistics Theory · Mathematics 2025-10-31 Jiahui Zou, Andrey Vasnev, Wendun Wang, Xinyu Zhang

Fixed and Increasing Domain Asymptotics for the Roughness and Scale of Isotropic Gaussian Random Fields

We establish a rigorous asymptotic theory for the joint estimation of roughness and scale parameters in two-dimensional Gaussian random fields with power-law generalized covariances (Matheron, 1973; Stein, 1999; Yaglom, 1987). Our main…

Statistics Theory · Mathematics 2025-10-31 Varun Kotharkar, Michael L. Stein

Deconvolution of distribution functions without integral transforms

We study the recovery of the distribution function $F_X$ of a random variable $X$ that is subject to an independent additive random error $\varepsilon$. To be precise, it is assumed that the target variable $X$ is available only in the form…

Statistics Theory · Mathematics 2025-10-31 Henrik Kaiser

Maximum Likelihood Estimation in the Multivariate and Matrix Variate Symmetric Laplace Distributions through Group Actions

In this paper, we study the maximum likelihood estimation of the parameters of the multivariate and matrix variate symmetric Laplace distributions through group actions. The multivariate and matrix variate symmetric Laplace distributions…

Statistics Theory · Mathematics 2025-10-31 Pooja Yadav, Tanuja Srivastava

A Fourier-based inference method for learning interaction kernels in particle systems

We consider the problem of inferring the interaction kernel of stochastic interacting particle systems from observations of a single particle. We adopt a semi-parametric approach and represent the interaction kernel in terms of a…

Statistics Theory · Mathematics 2025-10-31 Grigorios A. Pavliotis, Andrea Zanoni

Parameter estimation from local measurements for a class of stochastic Burgers equations

We deal with a class of semilinear SPDEs driven by space-time white noise that includes the one-dimensional stochastic Burgers equation. Such equations can have nonlocal and quadratic nonlinearities. We consider the problem of estimation of…

Statistics Theory · Mathematics 2025-10-31 Josef Janák, Enrico Priola

Survey Data Integration for Distribution Function Estimation

Estimates of finite population cumulative distribution functions (CDFs) and quantiles are critical for policy-making, resource allocation, and public health planning. For instance, federal finance agencies may require accurate estimates of…

Statistics Theory · Mathematics 2025-10-31 Jeremy Flood, Sayed Mostafa

Estimation of discrete distributions with high probability under $\chi^2$-divergence

We investigate the high-probability estimation of discrete distributions from an i.i.d. sample under $\chi^2$-divergence loss. Although the minimax risk in expectation is well understood, its high-probability counterpart remains largely…

Statistics Theory · Mathematics 2025-10-30 Sirine Louati

Stochastic Optimization in Semi-Discrete Optimal Transport: Convergence Analysis and Minimax Rate

We investigate the semi-discrete Optimal Transport (OT) problem, where a continuous source measure $\mu$ is transported to a discrete target measure $\nu$, with particular attention to the OT map approximation. In this setting, Stochastic…

Statistics Theory · Mathematics 2025-10-30 Ferdinand Genans, Antoine Godichon-Baggioni, François-Xavier Vialard, Olivier Wintenberger

A Modern Theory of Cross-Validation through the Lens of Stability

Modern data analysis and statistical learning are marked by complex data structures and black-box algorithms. Data complexity stems from technologies such as imaging, remote sensing, wearable devices, and genomic sequencing. At the same…

Statistics Theory · Mathematics 2025-10-30 Jing Lei

Multiple imputation and full law identifiability

The central challenges in missing data models concern the identifiability of two distributions: the target law and the full law. The target law refers to the joint distribution of the data variables, whereas the full law refers to the joint…

Statistics Theory · Mathematics 2025-10-30 Juha Karvanen, Santtu Tikka

Estimation in linear high dimensional Hawkes processes: a Bayesian approach

In this paper we study the frequentist properties of Bayesian approaches in linear high-dimensional Hawkes processes in a sparse regime where the number of interaction functions acting on each component of the Hawkes process is much smaller…

Statistics Theory · Mathematics 2025-10-29 Judith Rousseau, Vincent Rivoirard, Déborah Sulem

Sparse estimation for the drift of high-dimensional Ornstein-Uhlenbeck processes with i.i.d. paths

We study sparsity-regularized maximum likelihood estimation for the drift parameter of high-dimensional non-stationary Ornstein-Uhlenbeck processes given repeated measurements of i.i.d. paths. In particular, we show that Lasso and Slope…

Statistics Theory · Mathematics 2025-10-29 Shogo Nakakita

Minimax Estimation Problem for Periodically Correlated Stochastic Processes

The problem of optimal linear estimation of linear functionals depending on the unknown values of a periodically correlated stochastic process from observations of the process with additive noise is considered. Formulas for calculating the…

Statistics Theory · Mathematics 2025-10-29 Iryna Dubovets'ka, Mykhailo Moklyachuk