Statistics Theory - Scifaro

On Expectation Propagation and the Probabilistic Editor in some simple mixture problems

As for other latent-variable problems, exact Bayesian analysis is typically not practicable for mixture problems and approximate methods have been developed. Variational Bayes tends to produce approximate posterior distributions for…

Statistics Theory · Mathematics 2026-02-24 Nils Lid Hjort, Mike Titterington

A Selection Premium Decomposition for the Expected Maximum of Random Walks

When $K$ models are evaluated on the same validation set of size $n$, the selected winner's apparent performance is biased upward. Suppose $K$ models are evaluated on a shared sequence of i.i.d. observations $X_1,\dots, X_n$, where model…

Statistics Theory · Mathematics 2026-02-24 Victor H. de la Pena, Fangyuan Lin, Victor K. de la Pena

Concentration bounds for intrinsic dimension estimation using Gaussian kernels

We prove finite-sample concentration and anti-concentration bounds for dimension estimation using Gaussian kernel sums. Our bounds provide explicit dependence on sample size, bandwidth, and local geometric and distributional parameters,…

Statistics Theory · Mathematics 2026-02-24 Martin Andersson

High-Dimensional Asymptotics of Differentially Private PCA

In differential privacy, random noise is introduced to privatize summary statistics of a sensitive dataset before releasing them. The noise level determines the privacy loss, which quantifies how easily an adversary can detect a target…

Statistics Theory · Mathematics 2026-02-24 Youngjoo Yun, Rishabh Dudeja

Composite goodness-of-fit test with the Kernel Stein Discrepancy and a bootstrap for degenerate U-statistics with estimated parameters

This paper formally derives the asymptotic distribution of a goodness-of-fit test based on the Kernel Stein Discrepancy introduced in (Oscar Key et al., "Composite Goodness-of-fit Tests with Kernels", Journal of Machine Learning Research…

Statistics Theory · Mathematics 2026-02-24 Florian Brück, Veronika Reimoser, Fabian Baier

A New Class of Asymptotically Distribution-Free Smooth Tests

This article demonstrates how recent developments in the theory of empirical processes allow us to construct a new family of asymptotically distribution-free smooth tests. Their distribution-free property is preserved even when the…

Statistics Theory · Mathematics 2026-02-24 Xiangyu Zhang, Sara Algeri

Post-reduction inference for confidence sets of models

Sparsity in a regression context makes the model itself an object of interest, pointing to a confidence set of models as the appropriate presentation of evidence. A difficulty in areas such as genomics, where the number of candidate…

Statistics Theory · Mathematics 2026-02-24 Heather Battey, Daniel Garcia Rasines, Yanbo Tang

On the Study of Weighted Fractional Cumulative Residual Inaccuracy and its Dynamical Version with Applications

In recent years, there has been a growing interest in information measures that quantify inaccuracy and uncertainty in systems. In this paper, we introduce a novel concept called the Weighted Fractional Cumulative Residual Inaccuracy…

Statistics Theory · Mathematics 2026-02-24 Aman Pandey, Chanchal Kundu

Estimating quantile treatments without strict overlap

We consider the problem of estimating quantile treatment effects without assuming strict overlap, i.e., we do not assume that the propensity score is bounded away from zero. More specifically, we consider an inverse probability weighting…

Statistics Theory · Mathematics 2026-02-24 Marco Avella-Medina, Richard Davis, Gennady Samorodnitsky

Convergence rates for estimating multivariate scale mixtures of uniform densities

The Grenander estimator is a well-studied procedure for univariate nonparametric density estimation. It is usually defined as the Maximum Likelihood Estimator (MLE) over the class of all non-increasing densities on the positive real line.…

Statistics Theory · Mathematics 2026-02-24 Arlene K. H. Kim, Gil Kur, Adityanand Guntuboyina

Two step estimations via the Dantzig selector for models of stochastic processes with high-dimensional parameters

We consider the sparse estimation for stochastic processes with possibly infinite-dimensional nuisance parameters, by using the Dantzig selector which is a sparse estimation method similar to $Z$-estimation. When a consistent estimator for…

Statistics Theory · Mathematics 2026-02-24 Kou Fujimori, Koji Tsukuda

Quantitative concentration inequalities for the uniform approximation of the IDS

The integrated density of states (IDS) is a fundamental spectral quantity for quantum Hamiltonians modeling condensed matter systems, describing how densely energy levels are distributed. It can be interpreted as a volume-averaged spectral…

Statistics Theory · Mathematics 2026-02-23 Max Kämper, Christoph Schumacher, Fabian Schwarzenberger, Ivan Veselic

Minimax optimal adaptive structured transfer learning through semi-parametric domain-varying coefficient model

Transfer learning aims to improve inference in a target domain by leveraging information from related source domains, but its effectiveness critically depends on how cross-domain heterogeneity is modeled and controlled. When the conditional…

Statistics Theory · Mathematics 2026-02-23 Hanxiao Chen, Debarghya Mukherjee

Central limit theorem for the global clustering coefficient of random geometric graphs

The global clustering coefficient serves as a powerful metric for the structural analysis and comparison of complex networks. Random geometric graphs offer a realistic framework for representing the spatial constraints and geometry often…

Statistics Theory · Mathematics 2026-02-23 Mingao Yuan, Md. Niamul Islam Sium

Non-Stationary Covariance Functions for Spatial Data on Linear Networks

We introduce a novel class of non-stationary covariance functions for random fields on linear networks that allows both the variance and the correlation range of the random field to vary spatially. The proposed covariance functions are…

Statistics Theory · Mathematics 2026-02-23 Alfredo Alegría

Spectral Thresholds in Correlated Spiked Models and Fundamental Limits of Partial Least Squares

We provide a rigorous random matrix theory analysis of spiked cross-covariance models where the signals across two high-dimensional data channels are partially aligned. These models are motivated by multi-modal learning and form the…

Statistics Theory · Mathematics 2026-02-23 Pierre Mergny, Lenka Zdeborová

Asymptotically Optimal Sequential Testing with Markovian Data

We study one-sided and $\alpha$-correct sequential hypothesis testing for data generated by an ergodic Markov chain. The null hypothesis is that the unknown transition matrix belongs to a prescribed set $P$ of stochastic matrices, and the…

Statistics Theory · Mathematics 2026-02-20 Alhad Sethi, Kavali Sofia Sagar, Shubhada Agrawal, Debabrota Basu, P. N. Karthik

Optimal Unconstrained Self-Distillation in Ridge Regression: Strict Improvements, Precise Asymptotics, and One-Shot Tuning

Self-distillation (SD) is the process of retraining a student on a mixture of ground-truth labels and the teacher's own predictions using the same architecture and training data. Although SD has been empirically shown to often improve…

Statistics Theory · Mathematics 2026-02-20 Hien Dang, Pratik Patil, Alessandro Rinaldo

A Note on Inferential Decisions, Errors and Path-Dependency

Consider the sequential testing of binary outcomes. The a posteriori belief process and its objective conditional-probability counterpart generally differ but converge to the same result in well-defined tests. We show that unless the two…

Statistics Theory · Mathematics 2026-02-20 Kangda K. Wren

Offline changepoint localization using a matrix of conformal p-values

Changepoint localization is the problem of estimating the index at which a change occurred in the data generating distribution of an ordered list of data, or declaring that no change occurred. We present the broadly applicable MCP…

Statistics Theory · Mathematics 2026-02-20 Sanjit Dandapanthula, Aaditya Ramdas