Statistics Theory - Scifaro

Fixed-order PCA: Theory for Overestimated Factor Models

We develop asymptotic theory for principal component analysis (PCA) of a high-dimensional factor model in which the working dimension $R$ is fixed and only required to satisfy $R \ge r$, where $r$ is the true number of factors. Building on…

Statistics Theory · Mathematics 2026-05-19 Yuan Liao, Xin Tong, Wanjie Wang, Dacheng Xiu

Multi-state model with temporal-consistent survival analysis for homogeneous Markov chains

In this study, we consider sequences drawn from time-homogeneous Markov chains and introduce a novel approach for estimating first hitting-time distributions to specified terminal states. Our methodology is based on the…

Statistics Theory · Mathematics 2026-05-19 Mikael Escobar-Bach, Alexandre Popier, Malo Sahin

Uncertainty functionals revisited: Concavity and Jensen's inequality

This article presents a theoretical study of uncertainty functionals on general measurable spaces. These functionals are fundamental in experimental design and global sensitivity analysis, where they are used to quantify variability and…

Statistics Theory · Mathematics 2026-05-19 Julien Bect, Xujia Zhu

Self-Distillation is Optimal Among Spectral Shrinkage Estimators in Spiked Covariance Models

Self-distillation has emerged as a promising technique for improving model performance in modern machine learning systems. We develop the statistical foundations of self-distillation in spiked covariance models, by introducing and analyzing…

Statistics Theory · Mathematics 2026-05-19 Radu Lecoiu, Debarghya Mukherjee, Pragya Sur

Multivariate EDF tests for uniformity, normality, spherical and elliptical symmetry, and independence based on a Brownian sheet deconstruction

This paper extends a recently proposed family of EDF-based goodness-of-fit procedures for the hypercube $[0,1]^p$ - the m-test and the s-test - which are based on a unique deconstruction of the $p$-parameter Brownian sheet into independent…

Statistics Theory · Mathematics 2026-05-19 Alejandra Cabaña, Enrique M. Cabaña

Quantifying Dependence Between Random Vectors: A New Index with Applications

This article proposes a new index for quantifying the degree of dependence between random vectors. The index takes values in [0,1] and equals zero if and only if the random vectors are sub-independent. Unlike mere uncorrelatedness,…

Statistics Theory · Mathematics 2026-05-19 Chuancun Yin

Differentially private hypothesis testing in survival analysis

Survival analysis is widely used in applications involving sensitive individual-level data, yet differentially private hypothesis testing for right-censored data remains largely undeveloped. We initiate a finite-sample theory of private…

Statistics Theory · Mathematics 2026-05-19 Elly K. H. Hung, Yi Yu

Statistical Unlearning of Distributions: A Hypothesis Testing Approach

Machine learning systems increasingly face requirements to forget not only individual data points, but entire domains of information, such as toxic language, copyrighted corpora, or demographic biases. This raises a fundamental dilemma of…

Statistics Theory · Mathematics 2026-05-19 Aaradhya Pandey, Sanjeev Kulkarni

Asymptotic Anytime-Valid Inference for U-statistics

We study asymptotic anytime-valid confidence sequences for degree-two U-statistics under continuous monitoring. In the nondegenerate case, Hoeffding's projection reduces the problem to a time-uniform central limit theory for the partial…

Statistics Theory · Mathematics 2026-05-19 Leheng Cai, Qirui Hu, Weijia Li

The Threshold Breakdown Point

We introduce a novel approach to finite sample robustness that avoids the pessimism of traditional breakdown analyses. We define the threshold breakdown point, the smallest contamination fraction needed to induce a prescribed deviation, and…

Statistics Theory · Mathematics 2026-05-19 Tianjun Ke, Marco Avella Medina

Prediction Suboptimality of the Lasso in Sparse Linear Regression

The choice of the tuning parameter in the Lasso is central to its statistical performance in high-dimensional linear regression. In this work, we study tuning regimes under which the Lasso exhibits suboptimal prediction performance, in the…

Statistics Theory · Mathematics 2026-05-19 Guo Liu

Deconvolution in unlinked linear models

Unlinked regression, in which covariates and responses are observed separately without known correspondence, has recently gained increasing attention. Deconvolution, on the other hand, is a fundamental and challenging problem in…

Statistics Theory · Mathematics 2026-05-19 Fadoua Balabdaoui, Antonio Di Noia, Cécile Durot

Tuning-free Catoni-type joint robust estimation

This paper develops a Catoni-type joint (tuning-free) estimation framework for parametric models with heavy-tailed noise, in which the target parameter and the unknown noise variance are estimated simultaneously through a system of two…

Statistics Theory · Mathematics 2026-05-19 Xiang Li, Jun S. Liu, Qiang Sun, Lihu Xu

Quantifying the noise sensitivity of the Wasserstein metric for images

Wasserstein metrics are increasingly being used as similarity scores for images treated as discrete measures on a grid, yet their behavior under noise remains poorly understood. In this work, we consider the sensitivity of the signed…

Statistics Theory · Mathematics 2026-05-19 Erik Lager, Gilles Mordant, Amit Moscovich

Computable Bounds for Strong Approximations with Applications

The Komlós–Major–Tusnády (KMT) inequality for partial sums is one of the most celebrated results in probability theory. Yet its practical application has been hindered by a lack of practical constants.…

Statistics Theory · Mathematics 2026-05-19 Haoyu Ye, Morgane Austern

Tests for the mean of high-dimensional data

We consider the problem of testing the mean of high-dimensional data when the dimension may grow without explicit rate restrictions relative to the sample size. The proposed procedure is based on the statistic $V_n = n\|X_n\|^2$, which avoids…

Statistics Theory · Mathematics 2026-05-18 Dietmar Ferger

Nearest-Neighbour Matching on Unbounded Supports and Covariate Shift Transfer

Expectations of multivariate functions with missing labels occur in various fields such as transfer learning and average treatment effects. Although non-parametric estimators based on nearest-neighbour matching are frequently used in this…

Statistics Theory · Mathematics 2026-05-18 Simon Viel

Node-private community estimation in stochastic block models: Tractable algorithms and lower bounds

We study the classical problem of community recovery in stochastic block models with a fixed number of communities, with a twist: We seek algorithms that are stable with respect to node-wise changes in the graph structure, formally defined…

Statistics Theory · Mathematics 2026-05-18 Laurentiu Marchis, Ethan D'souza, Tomáš Flídr, Po-Ling Loh

Edge-indexed network time series with graph Ornstein-Uhlenbeck dynamics

We introduce a class of Lévy-driven graph Ornstein-Uhlenbeck (grOU) models for edge-indexed network time series. The proposed framework extends generalized network autoregressive (GNAR) processes for edge-indexed network time series to…

Statistics Theory · Mathematics 2026-05-18 Jiaming Chen, Almut E. D. Veraart

Goodness-of-Fit Testing for Point Processes in Large Populations

Suppose we have an observed path from a point process counting event occurrences in a large population. Based on the observed path, we would like to test the null hypothesis that the conditional intensity of the point process belongs to a…

Statistics Theory · Mathematics 2026-05-18 Sami Umut Can, Estate V. Khmaladze, Roger J. A. Laeven