Statistics Theory - Scifaro

Improving the Convergence Rates of Forward Gradient Descent with Repeated Sampling

Forward gradient descent (FGD) has been proposed as a biologically more plausible alternative to gradient descent, as it can be computed without a backward pass. Considering the linear model with $d$ parameters, previous work has found that…

Statistics Theory · Mathematics 2024-11-27 Niklas Dexheimer, Johannes Schmidt-Hieber

On the Symmetry of Limiting Distributions of M-estimators

Many functionals of interest in statistics and machine learning can be written as minimizers of expected loss functions. Such functionals are called $M$-estimands, and can be estimated by $M$-estimators -- minimizers of empirical average…

Statistics Theory · Mathematics 2024-11-27 Arunav Bhowmick, Arun Kumar Kuchibhotla

Optimal Estimation of Shared Singular Subspaces across Multiple Noisy Matrices

Estimating singular subspaces from noisy matrices is a fundamental problem with wide-ranging applications across various fields. Driven by the challenges of data integration and multi-view analysis, this study focuses on estimating shared…

Statistics Theory · Mathematics 2024-11-27 Zhengchi Ma, Rong Ma

Information and Complexity Analysis of Spatial Data

Information Theory provides a fundamental basis for analysis, and for a variety of subsequent methodological approaches, in relation to uncertainty quantification. The transversal character of concepts and derived results justifies its…

Statistics Theory · Mathematics 2024-11-27 Jose M. Angulo, Francisco J. Esquivel, Ana E. Madrid, Francisco J. Alonso

Limiting Spectral Distribution of a Random Commutator Matrix

We study the spectral properties of a class of random matrices of the form $S_n^{-} = n^{-1}(X_1 X_2^* - X_2 X_1^*)$ where $X_k = \Sigma^{1/2}Z_k$, for $k=1,2$, $Z_k$'s are independent $p\times n$ complex-valued random matrices, and…

Statistics Theory · Mathematics 2024-11-27 Javed Hazarika, Debashis Paul

Optimal sub-Gaussian variance proxy for truncated Gaussian and exponential random variables

This paper establishes the optimal sub-Gaussian variance proxy for truncated Gaussian and truncated exponential random variables. The proofs rely on first characterizing the optimal variance proxy as the unique solution to a set of two…

Statistics Theory · Mathematics 2024-11-27 Mathias Barreto, Olivier Marchal, Julyan Arbel

Confidence surfaces for the mean of locally stationary functional time series

The problem of constructing a simultaneous confidence surface for the 2-dimensional mean function of a non-stationary functional time series is challenging, as these bands cannot be built on classical limit theory for the maximum absolute…

Statistics Theory · Mathematics 2024-11-27 Holger Dette, Weichi Wu

Large dimensional Spearman's rank correlation matrices: The central limit theorem and its applications

This paper is concerned with Spearman's correlation matrices under the large-dimensional regime, in which the data dimension diverges to infinity proportionally with the sample size. We establish the central limit theorem for the linear…

Statistics Theory · Mathematics 2024-11-26 Hantao Chen, Cheng Wang

Federated PCA and Estimation for Spiked Covariance Matrices: Optimal Rates and Efficient Algorithm

Federated Learning (FL) has gained significant recent attention in machine learning for its enhanced privacy and data security, making it indispensable in fields such as healthcare, finance, and personalized services. This paper…

Statistics Theory · Mathematics 2024-11-26 Jingyang Li, T. Tony Cai, Dong Xia, Anru R. Zhang

Extremal bounds for Gaussian trace estimation

This work derives extremal tail bounds for the Gaussian trace estimator applied to a real symmetric matrix. We define a partial ordering on the eigenvalues, so that when a matrix has greater spectrum under this ordering, its estimator will…

Statistics Theory · Mathematics 2024-11-26 Eric Hallman

Heavy-tailed Contamination is Easier than Adversarial Contamination

A large body of work in the statistics and computer science communities dating back to Huber (Huber, 1960) has led to statistically and computationally efficient outlier-robust estimators. Two particular outlier models have received…

Statistics Theory · Mathematics 2024-11-26 Yeshwanth Cherapanamjeri, Daniel Lee

Filtering and Statistical Properties of Unimodal Maps Perturbed by Heteroscedastic Noises

We propose a theory of unimodal maps perturbed by a heteroscedastic Markov chain noise and experiencing another heteroscedastic noise due to uncertain observation. We address and treat the filtering problem, showing that by collecting more…

Statistics Theory · Mathematics 2024-11-26 Fabrizio Lillo, Stefano Marmi, Matteo Tanzi, Sandro Vaienti

Distribution-free tests for lossless feature selection in classification and regression

We study the problem of lossless feature selection for a $d$-dimensional feature vector $X=(X^{(1)},\dots ,X^{(d)})$ and label $Y$ for binary classification as well as nonparametric regression. For an index set $S\subset \{1,\dots ,d\}$,…

Statistics Theory · Mathematics 2024-11-26 László Györfi, Tamás Linder, Harro Walk

Sample Complexity of Probability Divergences under Group Symmetry

We rigorously quantify the improvement in the sample complexity of variational divergence estimations for group-invariant distributions. In the cases of the Wasserstein-1 metric and the Lipschitz-regularized $\alpha$-divergences, the…

Statistics Theory · Mathematics 2024-11-26 Ziyu Chen, Markos A. Katsoulakis, Luc Rey-Bellet, Wei Zhu

Generalized bootstrap in the Bures-Wasserstein space

This study focuses on finite-sample inference on the non-linear Bures-Wasserstein manifold and introduces a generalized bootstrap procedure for estimating Bures-Wasserstein barycenters. We provide non-asymptotic statistical guarantees for…

Statistics Theory · Mathematics 2024-11-26 Alexey Kroshnin, Vladimir Spokoiny, Alexandra Suvorikova

An asymptotically optimal Bernoulli factory for certain functions that can be expressed as power series

Given a sequence of independent Bernoulli variables with unknown parameter $p$, and a function $f$ expressed as a power series with non-negative coefficients that sum to at most $1$, an algorithm is presented that produces a Bernoulli…

Statistics Theory · Mathematics 2024-11-26 Luis Mendo

A note on the geodesic normal distribution on the sphere

This paper presents an alternative formulation of the geodesic normal distribution on the sphere, building on the work of Hauberg (2018). While the isotropic version of this distribution is naturally defined on the sphere, the anisotropic…

Statistics Theory · Mathematics 2024-11-25 José E. Chacón, Andrea Meilán-Vila

Tensors in algebraic statistics

Tensors are ubiquitous in statistics and data analysis. The central object that links data science to tensor theory and algebra is that of a model with latent variables. We provide an overview of tensor theory, with a particular emphasis on…

Statistics Theory · Mathematics 2024-11-22 Marta Casanellas, Luis Sierra, Piotr Zwiernik

Distributional regression: CRPS-error bounds for model fitting, model selection and convex aggregation

Distributional regression aims at estimating the conditional distribution of a target variable given explanatory covariates. It is a crucial tool for forecasting when a precise uncertainty quantification is required. A popular methodology…

Statistics Theory · Mathematics 2024-11-22 Clément Dombry, Ahmed Zaoui

Active Subsampling for Measurement-Constrained M-Estimation of Individualized Thresholds with High-Dimensional Data

In measurement-constrained problems, despite the availability of large datasets, we may only be able to afford to observe the labels on a small portion of the data. This poses the critical question of which data points are most…

Statistics Theory · Mathematics 2024-11-22 Jingyi Duan, Yang Ning