Machine Learning - Scifaro

TFTF: Training-Free Targeted Flow for Conditional Sampling

We propose a training-free conditional sampling method for flow matching models based on importance sampling. Because a naive application of importance sampling suffers from weight degeneracy in high-dimensional settings, we modify and…

Machine Learning · Statistics 2026-02-16 Qianqian Qu, Jun S. Liu

Annealing in variational inference mitigates mode collapse: A theoretical study on Gaussian mixtures

Mode collapse, the failure to capture one or more modes when targeting a multimodal distribution, is a central challenge in modern variational inference. In this work, we provide a mathematical analysis of annealing-based strategies for…

Machine Learning · Statistics 2026-02-16 Luigi Fogliani, Bruno Loureiro, Marylou Gabrié

Blessings of Multiple Good Arms in Multi-Objective Linear Bandits

The multi-objective bandit setting has traditionally been regarded as more complex than the single-objective case, as multiple objectives must be optimized simultaneously. In contrast to this prevailing view, we demonstrate that when…

Machine Learning · Statistics 2026-02-16 Heesang Ann, Min-hwan Oh

A Regularization-Sharpness Tradeoff for Linear Interpolators

The rule of thumb regarding the relationship between the bias-variance tradeoff and model size plays a key role in classical machine learning, but is now well known to break down in the overparameterized setting as per the double descent…

Machine Learning · Statistics 2026-02-16 Qingyi Hu, Liam Hodgkinson

The Implicit Bias of Logit Regularization

Logit regularization, the addition of a convex penalty directly in logit space, is widely used in modern classifiers, with label smoothing as a prominent example. While such methods often improve calibration and generalization, their…

Machine Learning · Statistics 2026-02-16 Alon Beck, Yohai Bar Sinai, Noam Levi

The Critical Horizon: Inspection Design Principles for Multi-Stage Operations and Deep Reasoning

Manufacturing lines, service journeys, supply chains, and AI reasoning chains share a common challenge: attributing a terminal outcome to the intermediate stage that caused it. We establish an information-theoretic barrier to this credit…

Machine Learning · Statistics 2026-02-16 Seyed Morteza Emadi

ROOFS: RObust biOmarker Feature Selection

Feature selection (FS) is essential for biomarker discovery and clinical predictive modeling. Over the past decades, methodological literature on FS has become rich and mature, offering a wide spectrum of algorithmic approaches. However,…

Machine Learning · Statistics 2026-02-16 Anastasiia Bakhmach, Paul Dufossé, Andrea Vaglio, Florence Monville, Laurent Greillier, Fabrice Barlési, Sébastien Benzekry

Variational phylogenetic inference with products over bipartitions

Bayesian phylogenetics is vital for understanding evolutionary dynamics, and requires accurate and efficient approximation of posterior distributions over trees. In this work, we develop a variational Bayesian approach for ultrametric…

Machine Learning · Statistics 2026-02-16 Evan Sidrow, Alexandre Bouchard-Côté, Lloyd T. Elliott

Towards Representation Learning for Weighting Problems in Design-Based Causal Inference

Reweighting a distribution to minimize a distance to a target distribution is a powerful and flexible strategy for estimating a wide range of causal effects, but can be challenging in practice because optimal weights typically depend on…

Machine Learning · Statistics 2026-02-16 Oscar Clivio, Avi Feller, Chris Holmes

Online Tensor Inference

Contemporary applications, such as recommendation systems and mobile health monitoring, require real-time processing and analysis of sequentially arriving high-dimensional tensor data. Traditional offline learning, involving the storage and…

Machine Learning · Statistics 2026-02-16 Xin Wen, Will Wei Sun, Yichen Zhang

AutoLL: Automatic Linear Layout of Graphs based on Deep Neural Network

Linear layouts are a graph visualization method that can be used to capture an entry pattern in an adjacency matrix of a given graph. By reordering the node indices of the original adjacency matrix, linear layouts provide knowledge of…

Machine Learning · Statistics 2026-02-16 Chihiro Watanabe, Taiji Suzuki

PAC-Bayesian Generalization Guarantees for Fairness on Stochastic and Deterministic Classifiers

Classical PAC generalization bounds on the prediction risk of a classifier are insufficient to provide theoretical guarantees on fairness when the goal is to learn models balancing predictive risk and fairness constraints. We propose a…

Machine Learning · Statistics 2026-02-13 Julien Bastian, Benjamin Leblanc, Pascal Germain, Amaury Habrard, Christine Largeron, Guillaume Metzler, Emilie Morvant, Paul Viallard

Estimation of instrument and noise parameters for inverse problem based on prior diffusion model

This article addresses the issue of estimating observation parameters (response and error parameters) in inverse problems. The focus is on cases where regularization is introduced in a Bayesian framework and the prior is modeled by a…

Machine Learning · Statistics 2026-02-13 Jean-François Giovannelli

Provable Offline Reinforcement Learning for Structured Cyclic MDPs

We introduce a novel cyclic Markov decision process (MDP) framework for multi-step decision problems with heterogeneous stage-specific dynamics, transitions, and discount factors across the cycle. In this setting, offline learning is…

Machine Learning · Statistics 2026-02-13 Kyungbok Lee, Angelica Cristello Sarteau, Michael R. Kosorok

The Cost of Learning under Multiple Change Points

We consider an online learning problem in environments with multiple change points. In contrast to the single change point problem that is widely studied using classical "high confidence" detection schemes, the multiple change point…

Machine Learning · Statistics 2026-02-13 Tomer Gafni, Garud Iyengar, Assaf Zeevi

Empirical Likelihood-Based Fairness Auditing: Distribution-Free Certification and Flagging

Machine learning models in high-stakes applications, such as recidivism prediction and automated personnel selection, often exhibit systematic performance disparities across sensitive subpopulations, raising critical concerns regarding…

Machine Learning · Statistics 2026-02-13 Jie Tang, Chuanlong Xie, Xianli Zeng, Lixing Zhu

Distributional Computational Graphs: Error Bounds

We study a general framework of distributional computational graphs: computational graphs whose inputs are probability distributions rather than point values. We analyze the discretization error that arises when these graphs are evaluated…

Machine Learning · Statistics 2026-02-13 Olof Hallqvist Elias, Michael Selby, Phillip Stanley-Marbell

Labels or Preferences? Budget-Constrained Learning with Human Judgments over AI-Generated Outputs

The increasing reliance on human preference feedback to judge AI-generated pseudo labels has created a pressing need for principled, budget-conscious data acquisition strategies. We address the crucial question of how to optimally allocate…

Machine Learning · Statistics 2026-02-13 Zihan Dong, Xiaotian Hou, Ruijia Wu, Linjun Zhang

Self-Concordant Perturbations for Linear Bandits

We consider the adversarial linear bandits setting and present a unified algorithmic framework that bridges Follow-the-Regularized-Leader (FTRL) and Follow-the-Perturbed-Leader (FTPL) methods, extending the known connection between them…

Machine Learning · Statistics 2026-02-13 Lucas Lévy, Jean-Lou Valeau, Arya Akhavan, Patrick Rebeschini

Preventing Model Collapse Under Overparametrization: Optimal Mixing Ratios for Interpolation Learning and Ridge Regression

Model collapse occurs when generative models degrade after repeatedly training on their own synthetic outputs. We study this effect in overparameterized linear regression in a setting where each iteration mixes fresh real labels with…

Machine Learning · Statistics 2026-02-13 Anvit Garg, Sohom Bhattacharya, Pragya Sur