Machine Learning - Scifaro

Anti-causal domain generalization: Leveraging unlabeled data

The problem of domain generalization concerns learning predictive models that are robust to distribution shifts when deployed in new, previously unseen environments. Existing methods typically require labeled data from multiple training…

Machine Learning · Statistics 2026-02-20 Sorawit Saengkyongam, Juan L. Gamella, Andrew C. Miller, Jonas Peters, Nicolai Meinshausen, Christina Heinze-Deml

Semi-Supervised Learning on Graphs using Graph Neural Networks

Graph neural networks (GNNs) work remarkably well in semi-supervised node regression, yet a rigorous theory explaining when and why they succeed remains lacking. To address this gap, we study an aggregate-and-readout model that encompasses…

Machine Learning · Statistics 2026-02-20 Juntong Chen, Claire Donnat, Olga Klopp, Johannes Schmidt-Hieber

Poisson-MNL Bandit: Nearly Optimal Dynamic Joint Assortment and Pricing with Decision-Dependent Customer Arrivals

We study dynamic joint assortment and pricing where a seller updates decisions at regular accounting/operating intervals to maximize the cumulative per-period revenue over a horizon $T$. In many settings, assortment and prices affect not…

Machine Learning · Statistics 2026-02-20 Junhui Cai, Ran Chen, Qitao Huang, Linda Zhao, Wu Zhu

Beyond Procedure: Substantive Fairness in Conformal Prediction

Conformal prediction (CP) offers distribution-free uncertainty quantification for machine learning models, yet its interplay with fairness in downstream decision-making remains underexplored. Moving beyond CP as a standalone operation…

Machine Learning · Statistics 2026-02-20 Pengqi Liu, Zijun Yu, Mouloud Belbahri, Arthur Charpentier, Masoud Asgharian, Jesse C. Cresswell

On sparsity, extremal structure, and monotonicity properties of Wasserstein and Gromov-Wasserstein optimal transport plans

This note gives a self-contained overview of some important properties of the Gromov-Wasserstein (GW) distance, compared with the standard linear optimal transport (OT) framework. More specifically, I explore the following questions: are GW…

Machine Learning · Statistics 2026-02-20 Titouan Vayer

Conjugate Learning Theory: Uncovering the Mechanisms of Trainability and Generalization in Deep Neural Networks

In this work, we propose a notion of practical learnability grounded in finite sample settings, and develop a conjugate learning theoretical framework based on convex conjugate duality to characterize this learnability property. Building on…

Machine Learning · Statistics 2026-02-20 Binchuan Qi

Fixed Budget is No Harder Than Fixed Confidence in Best-Arm Identification up to Logarithmic Factors

The best-arm identification (BAI) problem is one of the most fundamental problems in interactive machine learning, which has two flavors: the fixed-budget setting (FB) and the fixed-confidence setting (FC). For $K$-armed bandits with the…

Machine Learning · Statistics 2026-02-20 Kapilan Balagopalan, Yinan Li, Yao Zhao, Tuan Nguyen, Anton Daitche, Houssam Nassif, Kwang-Sung Jun

Beyond Predictive Uncertainty: Reliable Representation Learning with Structural Constraints

Uncertainty estimation in machine learning has traditionally focused on the prediction stage, aiming to quantify confidence in model outputs while treating learned representations as deterministic and reliable by default. In this work, we…

Machine Learning · Statistics 2026-02-20 Yiyao Yang

Input-Label Correlation Governs a Linear-to-Nonlinear Transition in Random Features under Spiked Covariance

Random feature models (RFMs), two-layer networks with a randomly initialized fixed first layer and a trained linear readout, are among the simplest nonlinear predictors. Prior asymptotic analyses in the proportional high-dimensional regime…

Machine Learning · Statistics 2026-02-20 Samet Demir, Zafer Dogan

Enhanced Diffusion Sampling: Efficient Rare Event Sampling and Free Energy Calculation with Diffusion Models

The rare-event sampling problem has long been the central limiting factor in molecular dynamics (MD), especially in biomolecular simulation. Recently, diffusion models such as BioEmu have emerged as powerful equilibrium samplers that…

Machine Learning · Statistics 2026-02-19 Yu Xie, Ludwig Winkler, Lixin Sun, Sarah Lewis, Adam E. Foster, José Jiménez Luna, Tim Hempel, Michael Gastegger, Yaoyi Chen, Iryna Zaporozhets, Cecilia Clementi, Christopher M. Bishop, Frank Noé

Error Propagation and Model Collapse in Diffusion Models: A Theoretical Study

Machine learning models are increasingly trained or fine-tuned on synthetic data. Recursively training on such data has been observed to significantly degrade performance in a wide range of tasks, often characterized by a progressive drift…

Machine Learning · Statistics 2026-02-19 Nail B. Khelifa, Richard E. Turner, Ramji Venkataramanan

Functional Decomposition and Shapley Interactions for Interpreting Survival Models

Hazard and survival functions are natural, interpretable targets in time-to-event prediction, but their inherent non-additivity fundamentally limits standard additive explanation methods. We introduce Survival Functional Decomposition…

Machine Learning · Statistics 2026-02-19 Sophie Hanna Langbein, Hubert Baniecki, Fabian Fumagalli, Niklas Koenen, Marvin N. Wright, Julia Herbinger

Learning Preference from Observed Rankings

Estimating consumer preferences is central to many problems in economics and marketing. This paper develops a flexible framework for learning individual preferences from partial ranking information by interpreting observed rankings as…

Machine Learning · Statistics 2026-02-19 Yu-Chang Chen, Chen Chian Fuh, Shang En Tsai

Machine Learning in Epidemiology

In the age of digital epidemiology, epidemiologists are faced with an increasing amount of data of growing complexity and dimensionality. Machine learning is a set of powerful tools that can help to analyze such enormous amounts of data. This…

Machine Learning · Statistics 2026-02-19 Marvin N. Wright, Lukas Burk, Pegah Golchian, Jan Kapar, Niklas Koenen, Sophie Hanna Langbein

Empirical Cumulative Distribution Function Clustering for LLM-based Agent System Analysis

Large language models (LLMs) are increasingly used as agents to solve complex tasks such as question answering (QA), scientific debate, and software development. A standard evaluation procedure aggregates multiple responses from LLM agents…

Machine Learning · Statistics 2026-02-19 Chihiro Watanabe, Jingyu Sun

Partial Identification under Missing Data Using Weak Shadow Variables from Pretrained Models

Estimating population quantities such as mean outcomes from user feedback is fundamental to platform evaluation and social science, yet feedback is often missing not at random (MNAR): users with stronger opinions are more likely to respond,…

Machine Learning · Statistics 2026-02-19 Hongyu Chen, David Simchi-Levi, Ruoxuan Xiong

Robust Stochastic Gradient Posterior Sampling with Lattice Based Discretisation

Stochastic-gradient MCMC methods enable scalable Bayesian posterior sampling but often suffer from sensitivity to minibatch size and gradient noise. To address this, we propose Stochastic Gradient Lattice Random Walk (SGLRW), an extension…

Machine Learning · Statistics 2026-02-19 Zier Mensch, Lars Holdijk, Samuel Duffield, Maxwell Aifer, Patrick J. Coles, Max Welling, Miranda C. N. Cheng

Including Node Textual Metadata in Laplacian-constrained Gaussian Graphical Models

This paper addresses graph learning in Gaussian Graphical Models (GGMs). In this context, data matrices often come with auxiliary metadata (e.g., textual descriptions associated with each node) that is usually ignored in traditional graph…

Machine Learning · Statistics 2026-02-19 Jianhua Wang, Killian Cressant, Pedro Braconnot Velloso, Arnaud Breloy

From Collapse to Improvement: Statistical Perspectives on the Evolutionary Dynamics of Iterative Training on Contaminated Sources

The problem of model collapse has presented new challenges in iterative training of generative models, where such training with synthetic data leads to an overall degradation of performance. This paper looks at the problem from a…

Machine Learning · Statistics 2026-02-19 Soham Bakshi, Sunrit Chakraborty

Logarithmic-time Schedules for Scaling Language Models with Momentum

In practice, the hyperparameters $(\beta_1, \beta_2)$ and weight-decay $\lambda$ in AdamW are typically kept at fixed values. Is there any reason to do otherwise? We show that for large-scale language model training, the answer is yes: by…

Machine Learning · Statistics 2026-02-19 Damien Ferbach, Courtney Paquette, Gauthier Gidel, Katie Everett, Elliot Paquette