Statistics Theory - Scifaro

The Hidden Toll of COVID-19 on Opioid Mortality in Georgia: A Bayesian Excess Opioid Mortality Analysis

COVID-19 has had a large-scale negative impact on the health of opioid users, exacerbating the health of an already vulnerable population. Critical information on the total impact of COVID-19 on opioid users is unknown due to a lack of…

Statistics Theory · Mathematics 2025-03-24 Cyen J. Peterkin, Lance A. Waller, Emily N. Peterson

Sobolev Calibration of Imperfect Computer Models

Calibration refers to the statistical estimation of unknown model parameters in computer experiments, such that computer experiments can match underlying physical systems. This work develops a new calibration method for imperfect computer…

Statistics Theory · Mathematics 2025-03-24 Qingwen Zhang, Wenjia Wang

Spectral gap bounds for reversible hybrid Gibbs chains

Hybrid Gibbs samplers represent a prominent class of approximated Gibbs algorithms that utilize Markov chains to approximate conditional distributions, with the Metropolis-within-Gibbs algorithm standing out as a well-known example. Despite…

Statistics Theory · Mathematics 2025-03-24 Qian Qin, Nianqiao Ju, Guanyang Wang

Statistical accuracy of the ensemble Kalman filter in the near-linear setting

Estimating the state of a dynamical system from partial and noisy observations is a ubiquitous problem in a large number of applications, such as probabilistic weather forecasting and prediction of epidemics. Particle filters are a widely…

Statistics Theory · Mathematics 2025-03-21 E. Calvello, J. A. Carrillo, F. Hoffmann, P. Monmarché, A. M. Stuart, U. Vaes

The Fundamental Limits of Recovering Planted Subgraphs

Given an arbitrary subgraph $H=H_n$ and $p=p_n \in (0,1)$, the planted subgraph model is defined as follows. A statistician observes the union of a random copy $H^*$ of $H$, together with random noise in the form of an instance of an…

Statistics Theory · Mathematics 2025-03-21 Daniel Lee, Francisco Pernice, Amit Rajaraman, Ilias Zadik

On the Functoriality of Belief Propagation Algorithms on finite Partially Ordered Sets

Undirected graphical models are a widely used class of probabilistic models in machine learning that capture prior knowledge or putative pairwise interactions between variables. Those interactions are encoded in a graph for pairwise…

Statistics Theory · Mathematics 2025-03-21 Grégoire Sergeant-Perthuis, Toby St Clere Smithe, Léo Boitel

Asymptotic non-linear shrinkage and eigenvector overlap for weighted sample covariance

We compute asymptotic non-linear shrinkage formulas for covariance and precision matrix estimators for weighted sample covariances, and the joint sample-population eigenvector overlap distribution, in the spirit of Ledoit and Péché. We…

Statistics Theory · Mathematics 2025-03-21 Benoit Oriol

Combining exchangeable p-values

The problem of combining p-values is an old and fundamental one, and the classic assumption of independence is often violated or unverifiable in many applications. There are many well-known rules that can combine a set of arbitrarily…

Statistics Theory · Mathematics 2025-03-21 Matteo Gasparin, Ruodu Wang, Aaditya Ramdas

Two improved algorithms for sparse generalized canonical correlation analysis

Regularized generalized canonical correlation analysis (RGCCA) is a generalization of regularized canonical correlation analysis to three or more sets of variables, which is a component-based approach aiming to study the relationships…

Statistics Theory · Mathematics 2025-03-21 Kuo-Yue Li, Qi-Ye Zhang, Yong-Han Sun

Asymptotically Optimal Sequential Multiple Testing Procedures for Correlated Normal

Simultaneous statistical inference has been a cornerstone in the statistics methodology literature because of its fundamental theory and paramount applications. The mainstream multiple testing literature has traditionally considered two…

Statistics Theory · Mathematics 2025-03-21 Monitirtha Dey, Subir Kumar Bhandari

A Shrinkage Likelihood Ratio Test for High-Dimensional Subgroup Analysis with a Logistic-Normal Mixture Model

In subgroup analysis, testing the existence of a subgroup with a differential treatment effect serves as protection against spurious subgroup discovery. Despite its importance, this hypothesis testing possesses a complicated nature:…

Statistics Theory · Mathematics 2025-03-21 Shota Takeishi

A re-examination to the SCoTLASS problems for SPCA and two projection-based methods for them

SCoTLASS is the first sparse principal component analysis (SPCA) model which imposes extra l1 norm constraints on the measured variables to obtain sparse loadings. Due to the difficulty of finding projections on the intersection of an…

Statistics Theory · Mathematics 2025-03-21 Qiye Zhang, Kuoyue Li

Uniformly consistent proportion estimation for composite hypotheses via integral equations: "the case of Gamma random variables"

We consider estimating the proportion of random variables for two types of composite null hypotheses: (i) the means of the random variables belonging to a non-empty, bounded interval; (ii) the means of the random variables belonging to an…

Statistics Theory · Mathematics 2025-03-21 Xiongzhi Chen

A Note on Local Linear Regression for Time Series in Banach Spaces

This work extends local linear regression to Banach space-valued time series for estimating smoothly varying means and their derivatives in non-stationary data. The asymptotic properties of both the standard and bias-reduced Jackknife…

Statistics Theory · Mathematics 2025-03-20 Florian Heinrichs

The Field Equations of Penalized non-Parametric Regression

We view penalized risks through the lens of the calculus of variations. We consider risks comprised of a fitness-term (e.g. MSE) and a gradient-based penalty. After establishing the Euler-Lagrange field equations as a systematic approach to…

Statistics Theory · Mathematics 2025-03-20 Sven Pappert

Hazard Rate for Associated Data in Deconvolution Problems: Asymptotic Normality

In reliability theory and survival analysis, observed data are often weakly dependent and subject to additive measurement errors. Such contamination arises when the underlying data are neither independent nor strongly mixed but instead…

Statistics Theory · Mathematics 2025-03-20 Benjrada Mohammed Essalih

On the Precise Asymptotics of Universal Inference

In statistical inference, confidence set procedures are typically evaluated based on their validity and width properties. Even when procedures achieve rate-optimal widths, confidence sets can still be excessively wide in practice due to…

Statistics Theory · Mathematics 2025-03-20 Kenta Takatsu

The broken sample problem revisited: Proof of a conjecture by Bai-Hsing and high-dimensional extensions

We revisit the classical broken sample problem: Two samples of i.i.d. data points $\mathbf{X}=\{X_1,\cdots, X_n\}$ and $\mathbf{Y}=\{Y_1,\cdots,Y_m\}$ are observed without correspondence with $m\leq n$. Under the null hypothesis,…

Statistics Theory · Mathematics 2025-03-20 Simiao Jiao, Yihong Wu, Jiaming Xu

A Unified Framework for Semiparametrically Efficient Semi-Supervised Learning

We consider statistical inference under a semi-supervised setting where we have access to both a labeled dataset consisting of pairs $\{X_i, Y_i \}_{i=1}^n$ and an unlabeled dataset $\{ X_i \}_{i=n+1}^{n+N}$. We ask the question: under what…

Statistics Theory · Mathematics 2025-03-20 Zichun Xu, Daniela Witten, Ali Shojaie

Rapid Bayesian Computation and Estimation for Neural Networks via Log-Concave Coupling

This paper studies a Bayesian estimation procedure for single-hidden-layer neural networks using $\ell_{1}$ controlled weights. We study the structure of the posterior density and provide a representation that makes it amenable to rapid…

Statistics Theory · Mathematics 2025-03-20 Curtis McDonald, Andrew R. Barron