Machine Learning - Scifaro

Infinite-dimensional generative diffusions via Doob's h-transform

This paper introduces a rigorous framework for defining generative diffusion models in infinite dimensions via Doob's h-transform. Rather than relying on time reversal of a noising process, a reference diffusion is forced towards the target…

Machine Learning · Statistics 2026-02-09 Thorben Pieper-Sethmacher, Daniel Paulin

Operationalizing Stein's Method for Online Linear Optimization: CLT-Based Optimal Tradeoffs

Adversarial online linear optimization (OLO) is essentially about making performance tradeoffs with respect to the unknown difficulty of the adversary. In the setting of one-dimensional fixed-time OLO on a bounded domain, it has been…

Machine Learning · Statistics 2026-02-09 Zhiyu Zhang, Aaditya Ramdas

Revisiting the Sliced Wasserstein Kernel for persistence diagrams: a Figalli-Gigli approach

The Sliced Wasserstein Kernel (SWK) for persistence diagrams was introduced in (Carrière et al. 2017) as a powerful tool to implicitly embed persistence diagrams in a Hilbert space with reasonable distortion. This kernel is built on the…

Machine Learning · Statistics 2026-02-09 Marc Janthial, Théo Lacombe

Time-uniform conformal and PAC prediction

Given that machine learning algorithms are increasingly being deployed to aid in high-stakes decision-making, uncertainty quantification methods that wrap around these black-box models, such as conformal prediction, have received much…

Machine Learning · Statistics 2026-02-09 Kayla E. Scharfstein, Arun Kumar Kuchibhotla

Inheritance Between Feedforward and Convolutional Networks via Model Projection

Techniques for feedforward networks (FFNs) and convolutional networks (CNNs) are frequently reused across families, but the relationship between the underlying model classes is rarely made explicit. We introduce a unified node-level…

Machine Learning · Statistics 2026-02-09 Nicolas Ewen, Jairo Diaz-Rodriguez, Kelly Ramsay

Radon-Wasserstein Gradient Flows for Interacting-Particle Sampling in High Dimensions

Gradient flows of the Kullback-Leibler (KL) divergence, such as the Fokker-Planck equation and Stein Variational Gradient Descent, evolve a distribution toward a target density known only up to a normalizing constant. We introduce new…

Machine Learning · Statistics 2026-02-09 Elias Hess-Childs, Dejan Slepčev, Lantian Xu

Performative Learning Theory

Performative predictions influence the very outcomes they aim to forecast. We study performative predictions that affect a sample (e.g., only existing users of an app) and/or the whole population (e.g., all potential app users). This raises…

Machine Learning · Statistics 2026-02-09 Julian Rodemann, Unai Fischer-Abaigar, James Bailie, Krikamol Muandet

FreDN: Spectral Disentanglement for Time Series Forecasting via Learnable Frequency Decomposition

Time series forecasting is essential in a wide range of real-world applications. Recently, frequency-domain methods have attracted increasing interest for their ability to capture global dependencies. However, when applied to non-stationary…

Machine Learning · Statistics 2026-02-09 Zhongde An, Jinhong You, Jiyanglin Li, Yiming Tang, Wen Li, Heming Du, Shouguo Du

Optimal Bias-variance Tradeoff in Matrix and Tensor Estimation

We study matrix and tensor denoising when the underlying signal is not necessarily low-rank. In the tensor setting, we observe $Y = X^\ast + Z \in \mathbb{R}^{p_1 \times p_2 \times p_3}$, where $X^\ast$ is an unknown signal…

Machine Learning · Statistics 2026-02-09 Shivam Kumar, Xiaokai Luo, Haotian Xu, Carlos Misael Madrid Padilla, Oscar Hernan Madrid Padilla, Daren Wang

Efficient Perplexity Bound and Ratio Matching in Discrete Diffusion Language Models

While continuous diffusion models excel in modeling continuous distributions, their application to categorical data has been less effective. Recent work has shown that ratio-matching through score-entropy within a continuous-time discrete…

Machine Learning · Statistics 2026-02-09 Etrit Haxholli, Yeti Z. Gurbuz, Ogul Can, Eli Waxman

Training-Conditional Coverage Bounds under Covariate Shift

Conformal prediction methodology has recently been extended to the covariate shift setting, where the distribution of covariates differs between training and test data. While existing results ensure that the prediction sets from these…

Machine Learning · Statistics 2026-02-09 Mehrdad Pournaderi, Yu Xiang

ProDAG: Projected Variational Inference for Directed Acyclic Graphs

Directed acyclic graph (DAG) learning is a central task in structure discovery and causal inference. Although the field has witnessed remarkable advances over the past few years, it remains statistically and computationally challenging to…

Machine Learning · Statistics 2026-02-09 Ryan Thompson, Edwin V. Bonilla, Robert Kohn

Causal Inference on Stopped Random Walks in Online Advertising

We consider a causal inference problem frequently encountered in online advertising systems, where a publisher (e.g., Instagram, TikTok) interacts repeatedly with human users and advertisers by sporadically displaying to each user an…

Machine Learning · Statistics 2026-02-06 Jia Yuan Yu

Transformers Are Born Biased: Structural Inductive Biases at Random Initialization and Their Practical Consequences

Transformers underpin modern large language models (LLMs) and are commonly assumed to be behaviorally unstructured at random initialization, with all meaningful preferences emerging only through large-scale training. We challenge this…

Machine Learning · Statistics 2026-02-06 Siquan Li, Yao Tong, Haonan Wang, Tianyang Hu

Wedge Sampling: Efficient Tensor Completion with Nearly-Linear Sample Complexity

We introduce Wedge Sampling, a new non-adaptive sampling scheme for low-rank tensor completion. We study recovery of an order-$k$ low-rank tensor of dimension $n \times \cdots \times n$ from a subset of its entries. Unlike the standard…

Machine Learning · Statistics 2026-02-06 Hengrui Luo, Anna Ma, Ludovic Stephan, Yizhe Zhu

Optimal scaling laws in learning hierarchical multi-index models

In this work, we provide a sharp theory of scaling laws for two-layer neural networks trained on a class of hierarchical multi-index targets, in a genuinely representation-limited regime. We derive exact information-theoretic scaling laws…

Machine Learning · Statistics 2026-02-06 Leonardo Defilippis, Florent Krzakala, Bruno Loureiro, Antoine Maillard

Optimal Bayesian Stopping for Efficient Inference of Consistent LLM Answers

A simple strategy for improving LLM accuracy, especially in math and reasoning problems, is to sample multiple responses and submit the answer most consistently reached. In this paper we leverage Bayesian prior information to save on…

Machine Learning · Statistics 2026-02-06 Jingkai Huang, Will Ma, Zhengyuan Zhou

Variance Reduction Based Experience Replay for Policy Optimization

Effective reinforcement learning (RL) for complex stochastic systems requires leveraging historical data collected in previous iterations to accelerate policy optimization. Classical experience replay treats all past observations uniformly…

Machine Learning · Statistics 2026-02-06 Hua Zheng, Wei Xie, M. Ben Feng, Keilung Choy

Decision-Focused Sequential Experimental Design: A Directional Uncertainty-Guided Approach

We consider the sequential experimental design problem in the predict-then-optimize paradigm. In this paradigm, the outputs of the prediction model are used as coefficient vectors in a downstream linear optimization problem. Traditional…

Machine Learning · Statistics 2026-02-06 Beichen Wan, Mo Liu, Paul Grigas, Zuo-Jun Max Shen

Total Variation Rates for Riemannian Flow Matching

Riemannian flow matching (RFM) extends flow-based generative modeling to data supported on manifolds by learning a time-dependent tangent vector field whose flow-ODE transports a simple base distribution to the data law. We develop a…

Machine Learning · Statistics 2026-02-06 Yunrui Guan, Krishnakumar Balasubramanian, Shiqian Ma