Machine Learning - Scifaro

Deep Optimal Individualized Treatment Rules for Bivariate Survival Outcomes via Adaptive Prediction-Powered Learning

In randomized trials involving multiple treatments, bivariate survival outcomes present significant analytical challenges for making decisions. This paper addresses the problem of deriving optimal individualized treatment rules to maximize…

Machine Learning · Statistics 2026-05-29 Kun Ren, Yifan Cui, Wen Su

Prediction-Powered Inference Across Many Tasks for AI Evaluation & Social Science Research

Many applications require statistically valid inference across many related tasks, while using only a handful of high-quality labels per hypothesis. In AI evaluation, these tasks may correspond to model behaviors across prompts, subgroups,…

Machine Learning · Statistics 2026-05-29 Nicolas Emmenegger, Ellery Stahler, Chara Podimata

Anytime-Valid Federated Conformal RAG for LLM Swarms

Federated Conformal RAG (FC-RAG) provides distribution-free coverage for a bandwidth-limited swarm of weak language models, but only at a fixed horizon. We extend it to anytime-valid sequential coverage: validity at every stopping time,…

Machine Learning · Statistics 2026-05-29 Prasanjit Dubey, Xiaoming Huo

Dynamics of Stochastic Momentum with Sparse Updates in High Dimensions

Existing theory of momentum assumes that gradients arrive at every parameter at a roughly constant rate, an assumption violated in practice by heavy-tailed data distributions and modern architectures. We theoretically analyze the dynamics…

Machine Learning · Statistics 2026-05-29 Katie Everett, Elliot Paquette

Bridging Maximum Likelihood and Optimal Transport for Efficient Inference and Model Selection in Stochastic Block Models

We study inference in stochastic block models (SBMs) through the lens of optimal transport (OT). We first establish that maximum likelihood variational inference (MLVI) can be interpreted as a semi-relaxed Gromov-Wasserstein (srGW)…

Machine Learning · Statistics 2026-05-29 Simon Queric, Cédric Vincent-Cuaz, Charles Bouveyron, Marco Corneli

Insurance Pricing Optimization via Off-Policy Evaluation

Traditional insurance pricing relies on risk-based principles that ensure actuarial fairness and solvency but do not explicitly account for policyholders' price sensitivity. We formulate insurance pricing as a decision-making problem and…

Machine Learning · Statistics 2026-05-29 Sascha Günther, Dimitri Semenovich, Mario V. Wüthrich

Triangular-Reference Schrödinger Bridges for Time Series Generation

We introduce Triangular-Reference Schrödinger Bridges for Time Series (TR-SBTS), a conservative extension of the SBTS framework in which the Brownian reference is replaced by an intervalwise frozen, possibly degenerate diffusion…

Machine Learning · Statistics 2026-05-29 Gabriele Bocchi

Stop Suppressing the Tail: Causal Inference for Extreme Events

Estimating how an outcome responds to a continuous treatment (the Average Dose-Response Function, or ADRF) is a core causal-inference primitive. However, when outcomes possess heavy tails, standard robust double machine learning (DML)…

Machine Learning · Statistics 2026-05-29 Eichi Uehara

MEDAL: Manifold Embedding Distillation via Autoencoder Learning

Low-dimensional embeddings are widely used as visual summaries of high-dimensional data and to enable downstream scientific discoveries. Yet, popular nonlinear dimension reduction methods, such as t-SNE and UMAP, are often selected based on…

Machine Learning · Statistics 2026-05-29 Irene Chang, Tarek M. Zikry, Genevera I. Allen

Online Learning-to-Defer with Varying Experts

Learning-to-Defer (L2D) methods route each query either to a predictive model or to external experts. While existing work studies this problem in batch settings, real-world deployments require handling streaming data, changing expert…

Machine Learning · Statistics 2026-05-29 Dang Hoang Duy, Yannis Montreuil, Maxime Meyer, Axel Carlier, Lai Xing Ng, Wei Tsang Ooi

Self-Supervised Laplace Approximation for Bayesian Uncertainty Quantification

Approximate Bayesian inference typically revolves around computing the posterior parameter distribution. In practice, however, the main object of interest is often a model's predictions rather than its parameters. In this work, we propose…

Machine Learning · Statistics 2026-05-29 Julian Rodemann, Alexander Marquard, Thomas Augustin, Michele Caprio

A Refined Generalization Analysis for Extreme Multi-class Supervised Contrastive Representation Learning

Contrastive Representation Learning (CRL) has achieved strong empirical success in multiple machine learning disciplines, yet its theoretical sample complexity remains poorly understood. Existing analyses usually assume that input tuples…

Machine Learning · Statistics 2026-05-29 Nong Minh Hieu, Antoine Ledent

Adaptive Learning via Off-Model Training and Importance Sampling for Fully Non-Markovian Optimal Stochastic Control. Complete version

This paper studies continuous-time stochastic control problems whose controlled states are fully non-Markovian and depend on unknown model parameters. Such problems arise naturally in path-dependent stochastic differential equations,…

Machine Learning · Statistics 2026-05-29 Dorival Leão, Alberto Ohashi, Simone Scotti, Adolfo M. D da Silva

Beyond Augmented-Action Surrogates for Multi-Expert Learning-to-Defer

A learning-to-defer (L2D) system decides, for each input, whether to predict on its own or to hand it to one of several available experts. The well-established recipe trains classifier and router jointly by treating the $K$ classes and…

Machine Learning · Statistics 2026-05-29 Yannis Montreuil, Axel Carlier, Lai Xing Ng, Wei Tsang Ooi

MEC: Machine-Learning-Assisted Generalized Entropy Calibration for Semi-Supervised Mean Estimation

Obtaining high-quality labels is costly, whereas unlabeled covariates are often abundant, motivating semi-supervised inference methods with reliable uncertainty quantification. Prediction-powered inference (PPI) leverages a machine-learning…

Machine Learning · Statistics 2026-05-29 Se Yoon Lee, Jae Kwang Kim

Measure flow path recovery in Bayes Hilbert spaces

We study the ill-posed problem of recovering a probability measure flow from finitely many moving localized sensors using a Bayes Hilbert framework. Relative to a fixed reference probability measure, a probability law is represented by its…

Machine Learning · Statistics 2026-05-29 S. David Mis, Maarten V. de Hoop

Learning-to-Defer with Expert-Conditional Advice

Learning-to-Defer routes each input to the expert that minimizes expected cost, but it assumes that the information available to every expert is fixed at decision time. Many modern systems violate this assumption: after selecting an expert,…

Machine Learning · Statistics 2026-05-29 Yannis Montreuil, Leïna Montreuil, Axel Carlier, Lai Xing Ng, Wei Tsang Ooi

Aggregate Models, Not Explanations: Improving Feature Importance Estimation

Feature-importance methods show promise in transforming machine learning models from predictive engines into tools for scientific discovery. However, due to data sampling and algorithmic stochasticity, expressive models can be unstable,…

Machine Learning · Statistics 2026-05-29 Joseph Paillard, Angel Reyero Lobo, Denis A. Engemann, Bertrand Thirion

Diffusion differentiable resampling

This paper is concerned with differentiable resampling in the context of sequential Monte Carlo (e.g., particle filtering). Drawing on reparametrisation, we propose a new resampling method that is informative and instantly differentiable,…

Machine Learning · Statistics 2026-05-29 Jennifer Rosina Andersson, Zheng Zhao

BITS for GAPS: Bayesian Information-Theoretic Sampling for hierarchical GAussian Process Surrogates

We introduce Bayesian Information-Theoretic Sampling for hierarchical GAussian Process Surrogates (BITS for GAPS), a framework enabling information-theoretic experimental design of Gaussian process-based surrogate models. Unlike standard…

Machine Learning · Statistics 2026-05-29 Kyla D. Jones, Alexander W. Dowling