Machine Learning - Scifaro

High-dimensional learning dynamics of multi-pass Stochastic Gradient Descent in multi-index models

We study the learning dynamics of a multi-pass, mini-batch Stochastic Gradient Descent (SGD) procedure for empirical risk minimization in high-dimensional multi-index models with isotropic random data. In an asymptotic regime where the…

Machine Learning · Statistics 2026-02-19 Zhou Fan, Leda Wang

Boosting methods for interval-censored data with regression and classification

Boosting has garnered significant interest across both machine learning and statistical communities. Traditional boosting algorithms, designed for fully observed random samples, often struggle with real-world problems, particularly with…

Machine Learning · Statistics 2026-02-19 Yuan Bian, Grace Y. Yi, Wenqing He

Transformers for Tabular Data: A Training Perspective of Self-Attention via Optimal Transport

This thesis examines self-attention training through the lens of Optimal Transport (OT) and develops an OT-based alternative for tabular classification. The study tracks intermediate projections of the self-attention layer during training…

Machine Learning · Statistics 2026-02-19 Alessandro Quadrio, Antonio Candelieri

High-dimensional limit theorems for SGD: Momentum and Adaptive Step-sizes

We develop a high-dimensional scaling limit for Stochastic Gradient Descent with Polyak Momentum (SGD-M) and adaptive step-sizes. This provides a framework to rigorously compare online SGD with some of its popular variants. We show that…

Machine Learning · Statistics 2026-02-19 Aukosh Jagannath, Taj Jones-McCormick, Varnan Sarangian

Conditionally Whitened Generative Models for Probabilistic Time Series Forecasting

Probabilistic forecasting of multivariate time series is challenging due to non-stationarity, inter-variable dependencies, and distribution shifts. While recent diffusion and flow matching models have shown promise, they often ignore…

Machine Learning · Statistics 2026-02-19 Yanfeng Yang, Siwei Chen, Pingping Hu, Zhaotong Shen, Yingjie Zhang, Zhuoran Sun, Shuai Li, Ziqi Chen, Kenji Fukumizu

Functional Central Limit Theorem for Stochastic Gradient Descent

We study the asymptotic shape of the trajectory of the stochastic gradient descent algorithm applied to a convex objective function. Under mild regularity assumptions, we prove a functional central limit theorem for the properly rescaled…

Machine Learning · Statistics 2026-02-18 Kessang Flamand, Victor-Emmanuel Brunel

Sparse Additive Model Pruning for Order-Based Causal Structure Learning

Causal structure learning, also known as causal discovery, aims to estimate causal relationships between variables in the form of a causal directed acyclic graph (DAG) from observational data. One of the major frameworks is the order-based…

Machine Learning · Statistics 2026-02-18 Kentaro Kanamori, Hirofumi Suzuki, Takuya Takagi

Universal priors: solving empirical Bayes via Bayesian inference and pretraining

We theoretically justify the recent empirical finding of [Teh et al., 2025] that a transformer pretrained on synthetically generated data achieves strong performance on empirical Bayes (EB) problems. We take an indirect approach to this…

Machine Learning · Statistics 2026-02-18 Nick Cannella, Anzo Teh, Yanjun Han, Yury Polyanskiy

Amortised and provably-robust simulation-based inference

Complex simulator-based models are now routinely used to perform inference across the sciences and engineering, but existing inference methods are often unable to account for outliers and other extreme values in data which occur due to…

Machine Learning · Statistics 2026-02-18 Ayush Bharti, Charita Dellaporta, Yuga Hikida, François-Xavier Briol

Quantifying Epistemic Uncertainty in Diffusion Models

To ensure high-quality outputs, it is important to quantify the epistemic uncertainty of diffusion models. Existing methods are often unreliable because they mix epistemic and aleatoric uncertainty. We introduce a method based on Fisher…

Machine Learning · Statistics 2026-02-18 Aditi Gupta, Raphael A. Meyer, Yotam Yaniv, Elynn Chen, N. Benjamin Erichson

Perfect Clustering for Sparse Directed Stochastic Block Models

Exact recovery in stochastic block models (SBMs) is well understood in undirected settings, but remains considerably less developed for directed and sparse networks, particularly when the number of communities diverges. Spectral methods for…

Machine Learning · Statistics 2026-02-18 Behzad Aalipur, Yichen Qin

Counterfactual Survival Q-learning via Buckley-James Boosting, with Applications to ACTG 175 and CALGB 8923

We propose a Buckley-James (BJ) Boost Q-learning framework for estimating optimal dynamic treatment regimes from right-censored survival outcomes in longitudinal randomized clinical trials, motivated by the clinical need to support patient…

Machine Learning · Statistics 2026-02-18 Jeongjin Lee, Jong-Min Kim

Linear Bandits beyond Inner Product Spaces, the case of Bandit Optimal Transport

Linear bandits have long been a central topic in online learning, with applications ranging from recommendation systems to adaptive clinical trials. Their general learnability has been established when the objective is to minimise the inner…

Machine Learning · Statistics 2026-02-18 Lorenzo Croissant

GenPANIS: A Latent-Variable Generative Framework for Forward and Inverse PDE Problems in Multiphase Media

Inverse problems and inverse design in multiphase media, i.e., recovering or engineering microstructures to achieve target macroscopic responses, require operating on discrete-valued material fields, rendering the problem non-differentiable…

Machine Learning · Statistics 2026-02-17 Matthaios Chatzopoulos, Phaedon-Stelios Koutsourelakis

Accelerating Posterior Inference from Pulsar Light Curves via Learned Latent Representations and Local Simulator-Guided Optimization

Posterior inference from pulsar observations in the form of light curves is commonly performed using Markov chain Monte Carlo methods, which are accurate but computationally expensive. We introduce a framework that accelerates posterior…

Machine Learning · Statistics 2026-02-17 Farhana Taiyebah, Abu Bucker Siddik, Indronil Bhattacharjee, Diane Oyen, Soumi De, Greg Olmschenk, Constantinos Kalapotharakos

Constrained and Composite Sampling via Proximal Sampler

We study two log-concave sampling problems: constrained sampling and composite sampling. First, we consider sampling from a target distribution with density proportional to $\exp(-f(x))$ supported on a convex set $K \subset \mathbb{R}^d$,…

Machine Learning · Statistics 2026-02-17 Thanh Dang, Jiaming Liang

Federated Ensemble Learning with Progressive Model Personalization

Federated Learning provides a privacy-preserving paradigm for distributed learning, but suffers from statistical heterogeneity across clients. Personalized Federated Learning (PFL) mitigates this issue by considering client-specific models.…

Machine Learning · Statistics 2026-02-17 Ala Emrani, Amir Najafi, Abolfazl Motahari

Why Self-Training Helps and Hurts: Denoising vs. Signal Forgetting

Iterative self-training (self-distillation) repeatedly refits a model on pseudo-labels generated by its own predictions. We study this procedure in overparameterized linear regression: an initial estimator is trained on noisy labels, and…

Machine Learning · Statistics 2026-02-17 Mingqi Wu, Archer Y. Yang, Qiang Sun

Computable Bernstein Certificates for Cross-Fitted Clipped Covariance Estimation

We study operator-norm covariance estimation from heavy-tailed samples that may include a small fraction of arbitrary outliers. A simple and widely used safeguard is Euclidean norm clipping, but its accuracy depends critically on an…

Machine Learning · Statistics 2026-02-17 Even He, Zaizai Yan

A Theoretical Framework for LLM Fine-tuning Using Early Stopping for Non-random Initialization

In the era of large language models (LLMs), fine-tuning pretrained models has become ubiquitous. Yet the theoretical underpinning remains an open question. A central question is why only a few epochs of fine-tuning are typically sufficient…

Machine Learning · Statistics 2026-02-17 Zexuan Sun, Garvesh Raskutti