Machine Learning - Scifaro

Oracle-Efficient Combinatorial Semi-Bandits

We study the combinatorial semi-bandit problem where an agent selects a subset of base arms and receives individual feedback. While this generalizes the classical multi-armed bandit and has broad applicability, its scalability is limited by…

Machine Learning · Statistics 2025-10-27 Jung-hun Kim, Milan Vojnović, Min-hwan Oh

Kernel Learning with Adversarial Features: Numerical Efficiency and Adaptive Regularization

Adversarial training has emerged as a key technique to enhance model robustness against adversarial input perturbations. Many of the existing methods rely on computationally expensive min-max problems that limit their application in…

Machine Learning · Statistics 2025-10-27 Antônio H. Ribeiro, David Vävinggren, Dave Zachariah, Thomas B. Schön, Francis Bach

Exponential Convergence Guarantees for Iterative Markovian Fitting

The Schrödinger Bridge (SB) problem has become a fundamental tool in computational optimal transport and generative modeling. To address this problem, ideal methods such as Iterative Proportional Fitting and Iterative Markovian Fitting…

Machine Learning · Statistics 2025-10-27 Marta Gentiloni Silveri, Giovanni Conforti, Alain Durmus

Learning Survival Models with Right-Censored Reporting Delays

Survival analysis is a statistical technique used to estimate the time until an event occurs. Although it is applied across a wide range of fields, adjusting for reporting delays under practical constraints remains a significant challenge…

Machine Learning · Statistics 2025-10-27 Yuta Shikuri, Hironori Fujisawa

RAPTOR-GEN: RApid PosTeriOR GENerator for Bayesian Learning in Biomanufacturing

Biopharmaceutical manufacturing is vital to public health but lacks the agility for rapid, on-demand production of biotherapeutics due to the complexity and variability of bioprocesses. To overcome this, we introduce RApid PosTeriOR…

Machine Learning · Statistics 2025-10-27 Wandi Xu, Wei Xie

Posterior Contraction for Sparse Neural Networks in Besov Spaces with Intrinsic Dimensionality

This work establishes that sparse Bayesian neural networks achieve optimal posterior contraction rates over anisotropic Besov spaces and their hierarchical compositions. These structures reflect the intrinsic dimensionality of the…

Machine Learning · Statistics 2025-10-27 Kyeongwon Lee, Lizhen Lin, Jaewoo Park, Seonghyun Jeong

Understanding challenges to the interpretation of disaggregated evaluations of algorithmic fairness

Disaggregated evaluation across subgroups is critical for assessing the fairness of machine learning models, but its uncritical use can mislead practitioners. We show that equal performance across subgroups is an unreliable measure of…

Machine Learning · Statistics 2025-10-27 Stephen R. Pfohl, Natalie Harris, Chirag Nagpal, David Madras, Vishwali Mhasawade, Olawale Salaudeen, Awa Dieng, Shannon Sequeira, Santiago Arciniegas, Lillian Sung, Nnamdi Ezeanochie, Heather Cole-Lewis, Katherine Heller, Sanmi Koyejo, Alexander D'Amour

STACI: Spatio-Temporal Aleatoric Conformal Inference

Fitting Gaussian Processes (GPs) provides interpretable aleatoric uncertainty quantification for estimation of spatio-temporal fields. Spatio-temporal deep learning models, while scalable, typically assume a simplistic independent…

Machine Learning · Statistics 2025-10-27 Brandon R. Feng, David Keetae Park, Xihaier Luo, Arantxa Urdangarin, Shinjae Yoo, Brian J. Reich

Lorentz Local Canonicalization: How to Make Any Network Lorentz-Equivariant

Lorentz-equivariant neural networks are becoming the leading architectures for high-energy physics. Current implementations rely on specialized layers, limiting architectural choices. We introduce Lorentz Local Canonicalization (LLoCa), a…

Machine Learning · Statistics 2025-10-27 Jonas Spinner, Luigi Favaro, Peter Lippmann, Sebastian Pitz, Gerrit Gerhartz, Tilman Plehn, Fred A. Hamprecht

Anytime-valid, Bayes-assisted, Prediction-Powered Inference

Given a large pool of unlabelled data and a smaller amount of labels, prediction-powered inference (PPI) leverages machine learning predictions to increase the statistical efficiency of confidence interval procedures based solely on…

Machine Learning · Statistics 2025-10-27 Valentin Kilian, Stefano Cortinovis, François Caron

An Efficient Orlicz-Sobolev Approach for Transporting Unbalanced Measures on a Graph

We investigate optimal transport (OT) for measures on graph metric spaces with different total masses. To mitigate the limitations of traditional $L^p$ geometry, Orlicz-Wasserstein (OW) and generalized Sobolev transport (GST) employ Orlicz…

Machine Learning · Statistics 2025-10-27 Tam Le, Truyen Nguyen, Hideitsu Hino, Kenji Fukumizu

Temperature Optimization for Bayesian Deep Learning

The Cold Posterior Effect (CPE) is a phenomenon in Bayesian Deep Learning (BDL), where tempering the posterior to a cold temperature often improves the predictive performance of the posterior predictive distribution (PPD). Although the term…

Machine Learning · Statistics 2025-10-27 Kenyon Ng, Chris van der Heide, Liam Hodgkinson, Susan Wei

Efficient Fairness-Performance Pareto Front Computation

There is a well-known intrinsic trade-off between the fairness of a representation and the performance of classifiers derived from the representation. Due to the complexity of optimisation algorithms in most modern representation learning…

Machine Learning · Statistics 2025-10-27 Mark Kozdoba, Binyamin Perets, Shie Mannor

Factor Fitting, Rank Allocation, and Partitioning in Multilevel Low Rank Matrices

We consider multilevel low rank (MLR) matrices, defined as a row and column permutation of a sum of matrices, each one a block diagonal refinement of the previous one, with all blocks low rank given in factored form. MLR matrices extend low…

Machine Learning · Statistics 2025-10-27 Tetiana Parshakova, Trevor Hastie, Eric Darve, Stephen Boyd

Regret Distribution in Stochastic Bandits: Optimal Trade-off between Expectation and Tail Risk

We study the optimal trade-off between expectation and tail risk for regret distribution in the stochastic multi-armed bandit model. We fully characterize the interplay among three desired properties for policy design: worst-case…

Machine Learning · Statistics 2025-10-27 David Simchi-Levi, Zeyu Zheng, Feng Zhu

Finding the Sweet Spot: Trading Quality, Cost, and Speed During Inference-Time LLM Reflection

As Large Language Models (LLMs) continue to evolve, practitioners face increasing options for enhancing inference-time performance without model retraining, including budget tuning and multi-step techniques like self-reflection. While these…

Machine Learning · Statistics 2025-10-24 Jack Butler, Nikita Kozodoi, Zainab Afolabi, Brian Tyacke, Gaiar Baimuratov

Diffusion Autoencoders with Perceivers for Long, Irregular and Multimodal Astronomical Sequences

Self-supervised learning has become a central strategy for representation learning, but the majority of architectures used for encoding data have only been validated on regularly-sampled inputs such as images, audio, and videos. In many…

Machine Learning · Statistics 2025-10-24 Yunyi Shen, Alexander Gagliano

Concentration and excess risk bounds for imbalanced classification with synthetic oversampling

Synthetic oversampling of minority examples using SMOTE and its variants is a leading strategy for addressing imbalanced classification problems. Despite the success of this approach in practice, its theoretical foundations remain…

Machine Learning · Statistics 2025-10-24 Touqeer Ahmad, Mohammadreza M. Kalan, François Portier, Gilles Stupfler

Learning Decentralized Routing Policies via Graph Attention-based Multi-Agent Reinforcement Learning in Lunar Delay-Tolerant Networks

We present a fully decentralized routing framework for multi-robot exploration missions operating under the constraints of a Lunar Delay-Tolerant Network (LDTN). In this setting, autonomous rovers must relay collected data to a lander under…

Machine Learning · Statistics 2025-10-24 Federico Lozano-Cuadra, Beatriz Soret, Marc Sanchez Net, Abhishek Cauligi, Federico Rossi

Neural Networks for Censored Expectile Regression Based on Data Augmentation

Expectile regression neural networks (ERNNs) are powerful tools for capturing heterogeneity and complex nonlinear structures in data. However, most existing research has primarily focused on fully observed data, with limited attention paid…

Machine Learning · Statistics 2025-10-24 Wei Cao, Shanshan Wang