Machine Learning - Scifaro

Convergence of Muon with Newton-Schulz

We analyze Muon as originally proposed and used in practice -- using the momentum orthogonalization with a few Newton-Schulz steps. The prior theoretical results replace this key step in Muon with an exact SVD-based polar factor. We prove…

Machine Learning · Statistics 2026-01-28 Gyu Yeol Kim, Min-hwan Oh

Collaborative Compressors in Distributed Mean Estimation with Limited Communication Budget

Distributed high dimensional mean estimation is a common aggregation routine used often in distributed optimization methods. Most of these applications call for a communication-constrained setting where vectors, whose mean is to be…

Machine Learning · Statistics 2026-01-28 Harsh Vardhan, Arya Mazumdar

Implicit Q-Learning and SARSA: Liberating Policy Control from Step-Size Calibration

Q-learning and SARSA are foundational reinforcement learning algorithms whose practical success depends critically on step-size calibration. Step-sizes that are too large can cause numerical instability, while step-sizes that are too small…

Machine Learning · Statistics 2026-01-28 Hwanwoo Kim, Eric Laber

A general framework for adaptive nonparametric dimensionality reduction

Dimensionality reduction is a fundamental task in modern data science. Several projection methods specifically tailored to take into account the non-linearity of the data via local embeddings have been proposed. Such methods are often based…

Machine Learning · Statistics 2026-01-28 Antonio Di Noia, Federico Ravenda, Antonietta Mira

Doubly-Regressing Approach for Subgroup Fairness

Algorithmic fairness is a socially crucial topic in real-world applications of AI. Among many notions of fairness, subgroup fairness is widely studied when multiple sensitive attributes (e.g., gender, race, age) are present. However, as the…

Machine Learning · Statistics 2026-01-28 Kunwoong Kim, Kyungseon Lee, Jihu Lee, Dongyoon Yang, Yongdai Kim

Concept activation vectors: a unifying view and adversarial attacks

Concept Activation Vectors (CAVs) are a tool from explainable AI, offering a promising approach for understanding how human-understandable concepts are encoded in a model's latent spaces. They are computed from hidden-layer activations of…

Machine Learning · Statistics 2026-01-28 Ekkehard Schnoor, Malik Tiomoko, Jawher Said, Alex Jung, Wojciech Samek

Incorporating priors in learning: a random matrix study under a teacher-student framework

Regularized linear regression is central to machine learning, yet its high-dimensional behavior with informative priors remains poorly understood. We provide the first exact asymptotic characterization of training and test risks for maximum…

Machine Learning · Statistics 2026-01-28 Malik Tiomoko, Ekkehard Schnoor

Bilateral Distribution Compression: Reducing Both Data Size and Dimensionality

Existing distribution compression methods reduce the number of observations in a dataset by minimising the Maximum Mean Discrepancy (MMD) between original and compressed sets, but modern datasets are often large in both sample size and…

Machine Learning · Statistics 2026-01-28 Dominic Broadbent, Nick Whiteley, Robert Allison, Tom Lovett

Global Minimizers of $\ell^p$-Regularized Objectives Yield the Sparsest ReLU Neural Networks

Overparameterized neural networks can interpolate a given dataset in many different ways, prompting the fundamental question: which among these solutions should we prefer, and what explicit regularization strategies will provably yield…

Machine Learning · Statistics 2026-01-28 Julia Nakhleh, Robert D. Nowak

Uncertainty-Aware Surrogate-based Amortized Bayesian Inference for Computationally Expensive Models

Bayesian inference typically relies on a large number of model evaluations to estimate posterior distributions. Established methods like Markov Chain Monte Carlo (MCMC) and Amortized Bayesian Inference (ABI) can become computationally…

Machine Learning · Statistics 2026-01-28 Stefania Scheurer, Philipp Reiser, Tim Brünnette, Wolfgang Nowak, Anneli Guthke, Paul-Christian Bürkner

Forecasting intermittent time series with Gaussian Processes and Tweedie likelihood

We adopt Gaussian Processes (GPs) as latent functions for probabilistic forecasting of intermittent time series. The model is trained in a Bayesian framework that accounts for the uncertainty about the latent function. We couple the latent…

Machine Learning · Statistics 2026-01-28 Stefano Damato, Dario Azzimonti, Giorgio Corani

Accelerating Ill-conditioned Hankel Matrix Recovery via Structured Newton-like Descent

This paper studies the robust Hankel recovery problem, which simultaneously removes the sparse outliers and fulfills missing entries from the partial observation. We propose a novel non-convex algorithm, coined Hankel Structured Newton-Like…

Machine Learning · Statistics 2026-01-28 HanQin Cai, Longxiu Huang, Xiliang Lu, Juntao You

Out-of-Distribution Radar Detection with Complex VAEs: Theory, Whitening, and ANMF Fusion

We investigate the detection of weak complex-valued signals immersed in non-Gaussian, range-varying interference, with emphasis on maritime radar scenarios. The proposed methodology exploits a Complex-valued Variational AutoEncoder (CVAE)…

Machine Learning · Statistics 2026-01-27 Yadang Alexis Rouzoumka, Jean Pinsolle, Eugénie Terreaux, Christèle Morisseau, Jean-Philippe Ovarlez, Chengfang Ren

Exact Minimum-Volume Confidence Set Intersection for Multinomial Outcomes

Computation of confidence sets is central to data science and machine learning, serving as the workhorse of A/B testing and underpinning the operation and analysis of reinforcement learning algorithms. Among all valid confidence sets for…

Machine Learning · Statistics 2026-01-27 Heguang Lin, Binhao Chen, Mengze Li, Daniel Pimentel-Alarcón, Matthew L. Malloy

Nonlinear multi-study factor analysis

High-dimensional data often exhibit variation that can be captured by lower dimensional factors. For high-dimensional data from multiple studies or environments, one goal is to understand which underlying factors are common to all studies,…

Machine Learning · Statistics 2026-01-27 Gemma E. Moran, Anandi Krishnan

A Cherry-Picking Approach to Large Load Shaping for More Effective Carbon Reduction

Shaping multi-megawatt loads, such as data centers, impacts generator dispatch on the electric grid, which in turn affects system CO2 emissions and energy cost. Substantiating the effectiveness of prevalent load shaping strategies, such as…

Machine Learning · Statistics 2026-01-27 Bokan Chen, Raiden Hasegawa, Adriaan Hilbers, Ross Koningstein, Ana Radovanović, Utkarsh Shah, Gabriela Volpato, Mohamed Ahmed, Tim Cary, Rod Frowd

"Rebuilding" Statistics in the Age of AI: A Town Hall Discussion on Culture, Infrastructure, and Training

This article presents the full, original record of the 2024 Joint Statistical Meetings (JSM) town hall, "Statistics in the Age of AI," which convened leading statisticians to discuss how the field is evolving in response to advances in…

Machine Learning · Statistics 2026-01-27 David L. Donoho, Jian Kang, Xihong Lin, Bhramar Mukherjee, Dan Nettleton, Rebecca Nugent, Abel Rodriguez, Eric P. Xing, Tian Zheng, Hongtu Zhu

Approximate full conformal prediction in an RKHS

Full conformal prediction is a framework that implicitly formulates distribution-free confidence prediction regions for a wide range of estimators. However, a classical limitation of the full conformal framework is the computation of the…

Machine Learning · Statistics 2026-01-27 Davidson Lova Razafindrakoto, Alain Celisse, Jérôme Lacaille

Improving the Accuracy of Amortized Model Comparison with Self-Consistency

Amortized Bayesian inference (ABI) offers fast, scalable approximations to posterior densities by training neural surrogates on data simulated from the statistical model. However, ABI methods are highly sensitive to model misspecification:…

Machine Learning · Statistics 2026-01-27 Šimon Kucharský, Aayush Mishra, Daniel Habermann, Stefan T. Radev, Paul-Christian Bürkner

Thermodynamic structure of the Sinkhorn flow

Entropy-regularized optimal transport, which has strong links to the Schrödinger bridge problem in statistical mechanics, enjoys a variety of applications from trajectory inference to generative modeling. A major driver of renewed…

Machine Learning · Statistics 2026-01-27 Anand Srinivasan, Jean-Jacques Slotine