Statistics Theory - Scifaro

Expectations of some ratio-type estimators under the gamma distribution

We study the expectations of some ratio-type estimators under the gamma distribution. Expectations of ratio-type estimators are often difficult to compute because they are constructed by combining two separate estimators.…

Statistics Theory · Mathematics 2025-05-09 Jia-Han Shih

Sparse Regularized Optimal Transport without Curse of Dimensionality

Entropic optimal transport -- the optimal transport problem regularized by KL divergence -- is highly successful in statistical applications. Thanks to the smoothness of the entropic coupling, its sample complexity avoids the curse of…

Statistics Theory · Mathematics 2025-05-09 Alberto González-Sanz, Stephan Eckstein, Marcel Nutz

Estimation of the long-run variance of nonlinear time series with an application to change point analysis

For a broad class of nonlinear time series known as Bernoulli shifts, we establish the asymptotic normality of the smoothed periodogram estimator of the long-run variance. This estimator uses only a narrow band of Fourier frequencies around…

Statistics Theory · Mathematics 2025-05-09 Vaidotas Characiejus, Piotr Kokoszka, Xiangdong Meng

Uncertainty quantification via cross-validation and its variants under algorithmic stability

Recently, there has been substantial interest in statistical guarantees for cross-validation (CV) methods of uncertainty quantification in statistical learning (cf. Barber et al. 2021a, Liang and Barber 2024, Steinberger and Leeb 2023).…

Statistics Theory · Mathematics 2025-05-09 Nicolai Amann, Hannes Leeb, Lukas Steinberger

Nonparametric Bayesian intensity estimation for covariate-driven inhomogeneous point processes

This work studies nonparametric Bayesian estimation of the intensity function of an inhomogeneous Poisson point process in the important case where the intensity depends on covariates, based on the observation of a single realisation of the…

Statistics Theory · Mathematics 2025-05-09 Matteo Giordano, Alisa Kirichenko, Judith Rousseau

A unified analysis of likelihood-based estimators in the Plackett–Luce model

The Plackett–Luce model has been extensively used for rank aggregation in social choice theory. A central statistical question in this model concerns estimating the utility vector that governs the model's likelihood. In this paper, we…

Statistics Theory · Mathematics 2025-05-09 Ruijian Han, Yiming Xu

PAC-Bayesian risk bounds for fully connected deep neural network with Gaussian priors

Deep neural networks (DNNs) have emerged as a powerful methodology with significant practical successes in fields such as computer vision and natural language processing. Recent works have demonstrated that sparsely connected DNNs with…

Statistics Theory · Mathematics 2025-05-08 The Tien Mai

Beyond entropic regularization: Debiased Gaussian estimators for discrete optimal transport and general linear programs

This work proposes new estimators for discrete optimal transport plans that enjoy Gaussian limits centered at the true solution. This behavior stands in stark contrast with the performance of existing estimators, including those based on…

Statistics Theory · Mathematics 2025-05-08 Shuyu Liu, Florentina Bunea, Jonathan Niles-Weed

Principal Curves In Metric Spaces And The Space Of Probability Measures

We introduce principal curves in Wasserstein space, and in general compact metric spaces. Our motivation for the Wasserstein case comes from optimal-transport-based trajectory inference, where a developing population of cells traces out a…

Statistics Theory · Mathematics 2025-05-08 Andrew Warren, Anton Afanassiev, Forest Kobayashi, Young-Heon Kim, Geoffrey Schiebinger

Double Cross-fit Doubly Robust Estimators: Beyond Series Regression

Doubly robust estimators with cross-fitting have gained popularity in causal inference due to their favorable structure-agnostic error guarantees. However, when additional structure, such as Hölder smoothness, is available then more…

Statistics Theory · Mathematics 2025-05-08 Alec McClean, Sivaraman Balakrishnan, Edward H. Kennedy, Larry Wasserman

Functional Partial Least-Squares: Adaptive Estimation and Inference

We study the functional linear regression model with a scalar response and a Hilbert space-valued predictor, a canonical example of an ill-posed inverse problem. We show that the functional partial least squares (PLS) estimator attains…

Statistics Theory · Mathematics 2025-05-08 Andrii Babii, Marine Carrasco, Idriss Tsafack

Maximum likelihood estimation for the $\lambda$-exponential family

The $\lambda$-exponential family generalizes the standard exponential family via a generalized convex duality motivated by optimal transport. It is the constant-curvature analogue of the exponential family from the information-geometric…

Statistics Theory · Mathematics 2025-05-07 Xiwei Tian, Ting-Kam Leonard Wong, Jiaowen Yang, Jun Zhang

Information-theoretic reduction of deep neural networks to linear models in the overparametrized proportional regime

We rigorously analyse fully-trained neural networks of arbitrary depth in the Bayesian optimal setting in the so-called proportional scaling regime where the number of training samples and width of the input and all inner layers diverge…

Statistics Theory · Mathematics 2025-05-07 Francesco Camilli, Daria Tieplova, Eleonora Bergamin, Jean Barbier

Depth based trimmed means

Robust estimation of location is a fundamental problem in statistics, particularly in scenarios where data contamination by outliers or model misspecification is a concern. In univariate settings, methods such as the sample median and…

Statistics Theory · Mathematics 2025-05-07 Alejandro Cholaquidis, Ricardo Fraiman, Leonardo Moreno, Gonzalo Perera

Method-of-Moments Inference for GLMs and Doubly Robust Functionals under Proportional Asymptotics

In this paper, we consider the estimation of regression coefficients and the signal-to-noise ratio (SNR) in high-dimensional Generalized Linear Models (GLMs), and explore their implications in inferring popular estimands such as average…

Statistics Theory · Mathematics 2025-05-07 Xingyu Chen, Lin Liu, Rajarshi Mukherjee

Valid Heteroskedasticity Robust Testing

Tests based on heteroskedasticity robust standard errors are an important technique in econometric practice. Choosing the right critical value, however, is not simple at all: conventional critical values based on asymptotics often lead to…

Statistics Theory · Mathematics 2025-05-07 Benedikt M. Pötscher, David Preinerstorfer

Hierarchical random measures without tables

The hierarchical Dirichlet process is the cornerstone of Bayesian nonparametric multilevel models. Its generative model can be described through a set of latent variables, commonly referred to as tables within the popular restaurant…

Statistics Theory · Mathematics 2025-05-06 Marta Catalano, Claudio Del Sole

Mallows-type model averaging: Non-asymptotic analysis and all-subset combination

Model averaging (MA) and ensembling play a crucial role in statistical and machine learning practice. When multiple candidate models are considered, MA techniques can be used to weight and combine them, often resulting in improved…

Statistics Theory · Mathematics 2025-05-06 Jingfu Peng

Persistence-based Modes Inference

We address the problem of estimating multiple modes of a multivariate density using persistent homology, a central tool in Topological Data Analysis. We introduce a method based on the preliminary estimation of the $H_0$-persistence diagram…

Statistics Theory · Mathematics 2025-05-06 Hugo Henneuse

Multivariate Gaussian Approximation for Random Forest via Region-based Stabilization

We derive Gaussian approximation bounds for $k$-Potential Nearest Neighbor ($k$-PNN) based random forest predictions based on a set of training points given by a Poisson process under fairly mild regularity assumptions on the data…

Statistics Theory · Mathematics 2025-05-06 Zhaoyang Shi, Chinmoy Bhattacharjee, Krishnakumar Balasubramanian, Wolfgang Polonik