Machine Learning - Scifaro

On the Hardness of Reinforcement Learning with Transition Look-Ahead

We study reinforcement learning (RL) with transition look-ahead, where the agent may observe which states would be visited upon playing any sequence of $\ell$ actions before deciding its course of action. While such predictive information…

Machine Learning · Statistics 2026-03-31 Corentin Pla, Hugo Richard, Marc Abeille, Nadav Merlis, Vianney Perchet

The Minimax Lower Bound of Kernel Stein Discrepancy Estimation

Kernel Stein discrepancies (KSDs) have emerged as a powerful tool for quantifying goodness-of-fit over the last decade, featuring numerous successful applications. To the best of our knowledge, all existing KSD estimators with known rate…

Machine Learning · Statistics 2026-03-31 Jose Cribeiro-Ramallo, Agnideep Aich, Florian Kalinke, Ashit Baran Aich, Zoltán Szabó

On some practical challenges of conformal prediction

Conformal prediction is a model-free machine learning method for constructing prediction regions at a guaranteed coverage probability level. However, a data scientist often faces three challenges in practice: (i) the determination of a…

Machine Learning · Statistics 2026-03-31 Liang Hong, Noura Raydan Nasreddine

Projection-based multifidelity linear regression for data-scarce applications

Surrogate modeling for systems with high-dimensional quantities of interest remains challenging, particularly when training data are costly to acquire. This work develops multifidelity methods for multiple-input multiple-output linear…

Machine Learning · Statistics 2026-03-31 Vignesh Sella, Julie Pham, Karen Willcox, Anirban Chaudhuri

Flow IV: Counterfactual Inference In Nonseparable Outcome Models Using Instrumental Variables

To reach human level intelligence, learning algorithms need to incorporate causal reasoning. But identifying causality, and particularly counterfactual reasoning, remains elusive. In this paper, we make progress on counterfactual inference…

Machine Learning · Statistics 2026-03-31 Marc Braun, Jose M. Peña, Adel Daoud

Learning to Choose or Choosing to Learn: Best-of-N vs. Supervised Fine-Tuning for Bit String Generation

Using the bit string generation problem as a case study, we theoretically compare two standard methods for adapting large language models to new tasks. The first, referred to as supervised fine-tuning, involves training a new next token…

Machine Learning · Statistics 2026-03-31 Seamus Somerstep, Vinod Raman, Unique Subedi, Yuekai Sun

Diffusion Models with Double Guidance: Generate with aggregated datasets

Creating large-scale datasets for training high-performance generative models is often prohibitively expensive, especially when associated attributes or annotations must be provided. As a result, merging existing datasets has become a…

Machine Learning · Statistics 2026-03-31 Yanfeng Yang, Kenji Fukumizu

Stacked conformal prediction

We consider a method for conformalizing a stacked ensemble of predictive models, showing that the potentially simple form of the meta-learner at the top of the stack enables a procedure with manageable computational cost that achieves…

Machine Learning · Statistics 2026-03-31 Paulo C. Marques F

Training Latent Diffusion Models with Interacting Particle Algorithms

We introduce a novel particle-based algorithm for end-to-end training of latent diffusion models. We reformulate the training task as minimizing a free energy functional and obtain a gradient flow that does so. By approximating the latter…

Machine Learning · Statistics 2026-03-31 Tim Y. J. Wang, Juan Kuntz, O. Deniz Akyildiz

Efficient Human-in-the-Loop Active Learning: A Novel Framework for Data Labeling in AI Systems

Modern AI algorithms require labeled data. In the real world, the majority of data are unlabeled. Labeling the data is costly. This is particularly true for some areas requiring special skills, such as reading radiology images by physicians. To…

Machine Learning · Statistics 2026-03-31 Yiran Huang, Jian-Feng Yang, Haoda Fu

Trans-Glasso: A Transfer Learning Approach to Precision Matrix Estimation

Precision matrix estimation is essential in various fields; yet it is challenging when samples for the target study are limited. Transfer learning can enhance estimation accuracy by leveraging data from related source studies. We propose…

Machine Learning · Statistics 2026-03-31 Boxin Zhao, Cong Ma, Mladen Kolar

Predictive variational inference: Learn the predictively optimal posterior distribution

Vanilla variational inference finds an optimal approximation to the Bayesian posterior distribution, but even the exact Bayesian posterior is often not meaningful under model misspecification. We propose predictive variational inference…

Machine Learning · Statistics 2026-03-31 Jinlin Lai, Antonio Linero, Yuling Yao

Kantorovich-Kernel Neural Operators: Approximation Theory, Asymptotics, and Neural Network Interpretation

This paper studies a class of multivariate Kantorovich-kernel neural network operators, including the deep Kantorovich-type neural network operators studied by Sharma and Singh. We prove density results, establish quantitative convergence…

Machine Learning · Statistics 2026-03-30 Tian-Xiao He

Generative Score Inference for Multimodal Data

Accurate uncertainty quantification is crucial for making reliable decisions in various supervised learning scenarios, particularly when dealing with complex, multimodal data such as images and text. Current approaches often face notable…

Machine Learning · Statistics 2026-03-30 Xinyu Tian, Xiaotong Shen

A Power-Weighted Noncentral Complex Gaussian Distribution

The complex Gaussian distribution has been widely used as a fundamental spectral and noise model in signal processing and communication. However, its Gaussian structure often limits its ability to represent the diverse amplitude…

Machine Learning · Statistics 2026-03-30 Toru Nakashika

Asymptotic Optimism for Tensor Regression Models with Applications to Neural Network Compression

We study rank selection for low-rank tensor regression under random covariates design. Under a Gaussian random-design model and some mild conditions, we derive population expressions for the expected training-testing discrepancy (optimism)…

Machine Learning · Statistics 2026-03-30 Haoming Shi, Eric C. Chi, Hengrui Luo

Beyond identifiability: Learning causal representations with few environments and finite samples

We provide explicit, finite-sample guarantees for learning causal representations from data with a sublinear number of environments. Causal representation learning seeks to provide a rigorous foundation for the general representation…

Machine Learning · Statistics 2026-03-30 Inbeom Lee, Tongtong Jin, Bryon Aragam

SAHMM-VAE: A Source-Wise Adaptive Hidden Markov Prior Variational Autoencoder for Unsupervised Blind Source Separation

We propose SAHMM-VAE, a source-wise adaptive Hidden Markov prior variational autoencoder for unsupervised blind source separation. Instead of treating the latent prior as a single generic regularizer, the proposed framework assigns each…

Machine Learning · Statistics 2026-03-30 Yuan-Hao Wei

Conformal Graph Prediction with Z-Gromov Wasserstein Distances

Supervised graph prediction addresses regression problems where the outputs are structured graphs. Although several approaches exist for graph-valued prediction, principled uncertainty quantification remains limited. We propose a conformal…

Machine Learning · Statistics 2026-03-30 Gabriel Melo, Thibaut de Saivre, Anna Calissano, Florence d'Alché-Buc

Generative modeling of conditional probability distributions on the level-sets of collective variables

Given a probability distribution $\mu$ in $\mathbb{R}^d$ represented by data, we study in this paper the generative modeling of the corresponding conditional probability distributions on the level-sets of a collective variable…

Machine Learning · Statistics 2026-03-30 Fatima-Zahrae Akhyar, Wei Zhang, Gabriel Stoltz, Christof Schütte