Statistics Theory - Scifaro

New explanations and inference for least angle regression

Efron et al. (2004) introduced least angle regression (LAR) as an algorithm for linear predictions, intended as an alternative to forward selection with connections to penalized regression. However, LAR has remained somewhat of a "black…

Statistics Theory · Mathematics 2026-02-03 Karl B. Gregory, Daniel J. Nordman

A Kullback-Leibler divergence test for multivariate extremes: theory and practice

Testing whether two multivariate samples exhibit the same extremal behavior is an important problem in various fields including environmental and climate sciences. While several ad-hoc approaches exist in the literature, they often lack…

Statistics Theory · Mathematics 2026-02-03 Sebastian Engelke, Philippe Naveau, Chen Zhou

Handling Covariate Mismatch in Federated Linear Prediction

Federated learning enables institutions to train predictive models collaboratively without sharing raw data, addressing privacy and regulatory constraints. In the standard horizontal setting, clients hold disjoint cohorts of individuals and…

Statistics Theory · Mathematics 2026-02-03 Alexis Ayme, Rémi Khellaf

Estimating Conditional Distributions via Sklar's Theorem and Empirical Checkerboard Approximations, with Consequences to Nonparametric Regression

We tackle the natural question of whether it is possible to estimate conditional distributions via Sklar's theorem by separately estimating the conditional distributions of the underlying copula and the marginals. Working with so-called…

Statistics Theory · Mathematics 2026-02-03 Kai Schärer, Wolfgang Trutschnig

Improving Minimax Estimation Rates for Contaminated Mixture of Multinomial Logistic Experts via Expert Heterogeneity

Contaminated mixture of experts (MoE) is motivated by transfer learning methods where a pre-trained model, acting as a frozen expert, is integrated with an adapter model, functioning as a trainable expert, in order to learn a new task.…

Statistics Theory · Mathematics 2026-02-03 Fanqi Yan, Dung Le, Trang Pham, Huy Nguyen, Nhat Ho

Efficient Bayesian Inference in Strictly Semi-parametric Linear Inverse Problems

We consider the efficient inference of finite dimensional parameters arising in the context of inverse problems. Our setup is the observation of a transformation of an unknown infinite dimensional signal $f$ corrupted by statistical noise,…

Statistics Theory · Mathematics 2026-02-03 Adel Magra, Aad van der Vaart

Semi-parametric Bernstein-von Mises Theorem in a Parabolic PDE Problem

We consider the heat equation with absorption in a bounded domain of $\mathbb{R}^d$, where both the scalar diffusivity and the absorption function are unknown. We investigate a Bayesian approach for recovering the diffusivity from a noisy…

Statistics Theory · Mathematics 2026-02-03 Adel Magra, Frank van der Meulen, Aad van der Vaart

LSD of sample covariances of superposition of matrices with separable covariance structure

We study the asymptotic behavior of the spectra of matrices of the form $S_n = \frac{1}{n}XX^*$, where $X = \sum_{r=1}^K X_r$ with $X_r = A_r^{1/2} Z_r B_r^{1/2}$, $K \in \mathbb{N}$, and $A_r, B_r$ are sequences of positive…

Statistics Theory · Mathematics 2026-02-03 Javed Hazarika, Debashis Paul

Modifications of the BIC for order selection in finite mixture models

Finite mixture models are ubiquitous in modern statistical modeling, and a recurring practical issue is choosing the model order. In Keribin (2000, Sankhyā Series A, 62, pp. 49–66), the Bayesian information…

Statistics Theory · Mathematics 2026-02-03 Hien Duy Nguyen, TrungTin Nguyen

Multivariate Species Sampling Models

Species sampling processes have long served as the fundamental framework for modeling random discrete distributions and exchangeable sequences. However, data arising from distinct but related sources require a broader notion of…

Statistics Theory · Mathematics 2026-02-03 Beatrice Franzolini, Antonio Lijoi, Igor Prünster, Giovanni Rebaudo

Conditional Feature Importance revisited: Double Robustness, Efficiency and Inference

Conditional Feature Importance (CFI) is a classical variable importance measure that accounts for the relationship between the studied feature and the others. However, CFI has not yet been studied from a theoretical perspective because the…

Statistics Theory · Mathematics 2026-02-03 Angel Reyero-Lobo, Pierre Neuvial, Bertrand Thirion

High-order Accurate Inference on Manifolds

We present a new framework for statistical inference on Riemannian manifolds that achieves high-order accuracy, addressing the challenges posed by non-Euclidean parameter spaces frequently encountered in modern data science. Our approach…

Statistics Theory · Mathematics 2026-02-03 Chengzhu Huang, Anru R. Zhang

The envelope of a complex Gaussian random variable

The envelope of an elliptical Gaussian complex vector, or equivalently the amplitude or norm of a bivariate normal random vector, has applications in many weather and signal processing contexts. We explicitly characterize its distribution in…

Statistics Theory · Mathematics 2026-02-03 Sattwik Ghosal, Ranjan Maitra

Semi-knockoffs: a model-agnostic conditional independence testing method with finite-sample guarantees

Conditional independence testing (CIT) is essential for reliable scientific discovery. It prevents spurious findings and enables controlled feature selection. Recent CIT methods have used machine learning (ML) models as surrogates of the…

Statistics Theory · Mathematics 2026-02-02 Angel Reyero-Lobo, Bertrand Thirion, Pierre Neuvial

Persuasive Privacy

We propose a novel framework for measuring privacy from a Bayesian game-theoretic perspective. This framework enables the creation of new, purpose-driven privacy definitions that are rigorously justified, while also allowing for the…

Statistics Theory · Mathematics 2026-02-02 Joshua J Bon, James Bailie, Judith Rousseau, Christian P Robert

Asymmetric conformal prediction with penalized kernel sum-of-squares

Conformal prediction (CP) is a distribution-free method to construct reliable prediction intervals that has gained significant attention in recent years. Despite its success and various proposed extensions, a significant practical feature…

Statistics Theory · Mathematics 2026-02-02 Louis Allain, Sébastien Da Veiga, Brian Staber

Convergence of Multi-Level Markov Chain Monte Carlo Adaptive Stochastic Gradient Algorithms

Stochastic optimization in learning and inference often relies on Markov chain Monte Carlo (MCMC) to approximate gradients when exact computation is intractable. However, finite-time MCMC estimators are biased, and reducing this bias…

Statistics Theory · Mathematics 2026-02-02 Antoine Godichon-Baggioni, Gabriel Lang, Sylvain Le Corff, Julien Stoehr, Sobihan Surendran

A spectral approach for online covariance change point detection

Change point detection in covariance structures is a fundamental and crucial problem for sequential data. Under the high-dimensional setting, most of the existing research has focused on identifying change points in historical data.…

Statistics Theory · Mathematics 2026-02-02 Zhigang Bao, Kha Man Cheong, Yuji Li, Jiaxin Qiu

Model-oriented Graph Distances via Partially Ordered Sets

A well-defined distance on the parameter space is key to evaluating estimators, ensuring consistency, and building confidence sets. While there are typically standard distances to adopt in a continuous space, this is not the case for…

Statistics Theory · Mathematics 2026-02-02 Armeen Taeb, F. Richard Guo, Leonard Henckel

On estimation of weighted cumulative residual Tsallis entropy for complete and censored samples

Recently, weighted cumulative residual Tsallis entropy has been introduced in the literature as a generalization of weighted cumulative residual entropy. We study some new properties of the weighted cumulative residual Tsallis entropy measure.…

Statistics Theory · Mathematics 2026-02-02 Siddhartha Chakraborty, Asok K. Nanda