Machine Learning - Scifaro

Amortized Variational Inference for Joint Posterior and Predictive Distributions in Bayesian Uncertainty Quantification

Bayesian predictive inference propagates parameter uncertainty to quantities of interest through the posterior-predictive distribution. In practice, this is typically performed using a two-stage procedure: first approximating the posterior…

Machine Learning · Statistics 2026-05-06 Nan Feng, Xun Huan

Free Decompression with Algebraic Spectral Curves

Tools from random matrix theory have become central to deep learning theory, using spectral information to provide mechanisms for modeling generalization, robustness, scaling, and failure modes. While often capable of modeling empirical…

Machine Learning · Statistics 2026-05-06 Siavash Ameli, Chris van der Heide, Liam Hodgkinson, Michael W. Mahoney

Adaptive Estimation and Optimal Control in Offline Contextual MDPs without Stationarity

Contextual MDPs are powerful tools with wide applicability in areas from biostatistics to machine learning. However, specializing them to offline datasets has been challenging due to a lack of robust, theoretically backed methods. Our work…

Machine Learning · Statistics 2026-05-06 Riddhiman Bhattacharyya, Sayak Chakrabarty, Imon Banerjee

Imbalanced Classification under Capacity Constraints

In many classification settings, the class of primary interest is underrepresented, leading to imbalanced data problems that arise in applications such as rare disease detection and fraud identification. In these contexts, identifying a…

Machine Learning · Statistics 2026-05-06 Daniel Fraiman, Ricardo Fraiman

On the Spectral Structure and Objective Equivalence of Orthogonal Multilabel Fisher Discriminants

We provide a unified theoretical analysis of Linear Discriminant Analysis with simultaneous multilabel scatter matrix formulations and Stiefel orthogonality constraints. Our contributions span both algebraic structure and statistical…

Machine Learning · Statistics 2026-05-06 Brian Keith-Norambuena, Juan Bekios-Calfa

Partial Effective Information Decomposition for Synergistic Causality

Causality is a central topic in scientific inquiry, yet for complex systems, the identification and analysis of synergistic causation remain a challenging and fundamental problem. In the context of causal relations among multivariate…

Machine Learning · Statistics 2026-05-06 Mingzhe Yang, Shuo Wang, Jiang Zhang

Intrinsic effective sample size for manifold-valued Markov chain Monte Carlo via kernel discrepancy

Effective sample size is a standard summary of Markov chain Monte Carlo output, but it is usually attached to scalar or Euclidean summaries chosen by the analyst. For manifold-valued samples this choice is not canonical: coordinate-wise…

Machine Learning · Statistics 2026-05-06 Kisung You

Conformalized Percentile Interval: Finite Sample Validity and Improved Conditional Performance

Conformal prediction provides distribution-free predictive intervals with finite-sample marginal coverage. However, achieving conditional validity and interval efficiency (in terms of short interval length) remains challenging, particularly…

Machine Learning · Statistics 2026-05-06 Ran Zou, Wanrong Zhu, Bin Nan

Highly Adaptive Principal Component Regression

The Highly Adaptive Lasso (HAL) is a nonparametric regression method that achieves almost dimension-free convergence rates under minimal smoothness assumptions, but its implementation can be computationally prohibitive in high dimensions…

Machine Learning · Statistics 2026-05-06 Mingxun Wang, Alejandro Schuler, Mark van der Laan, Carlos García Meixide

Optimal control of the future via prospective learning with control

Optimal control of the future is the next frontier for AI. Current approaches to this problem are typically rooted in reinforcement learning (RL). RL is mathematically distinct from supervised learning, which has been the main workhorse for…

Machine Learning · Statistics 2026-05-06 Yuxin Bai, Aranyak Acharyya, Ashwin De Silva, Zeyu Shen, James Hassett, Joshua T. Vogelstein

Efficient Deconvolution in Populational Inverse Problems

This work is focussed on the inversion task of inferring the distribution over parameters of interest leading to multiple sets of observations. The potential to solve such distributional inversion problems is driven by increasing…

Machine Learning · Statistics 2026-05-06 Arnaud Vadeboncoeur, Mark Girolami, Andrew M. Stuart

Random-Effects Algorithm for Random Objects in Metric Spaces

Across many scientific disciplines, multiple observations are collected from the same experimental units, and in modern datasets these observations often arise as non-Euclidean random objects. In such settings, the incorporation of random…

Machine Learning · Statistics 2026-05-05 Marcos Matabuena, Mateo Cámara

ParaRNN: An Interpretable and Parallelizable Recurrent Neural Network for Time-Dependent Data

The proliferation of large-scale and structurally complex data has spurred the integration of machine learning methods into statistical modeling. Recurrent neural networks (RNNs), a foundational class of models for time-dependent data, can…

Machine Learning · Statistics 2026-05-05 Yuxi Cai, Lan Li, Feiqing Huang, Guodong Li

Online Generalised Predictive Coding

This paper introduces an extension of generalised filtering for online applications. Generalised filtering refers to data assimilation schemes that jointly infer latent states, learn unknown model parameters, and estimate uncertainty in an…

Machine Learning · Statistics 2026-05-05 Mehran H. Z. Bazargani, Szymon Urbas, Adeel Razi, Thomas Brendan Murphy, Karl Friston

Black-box optimization of noisy functions with unknown smoothness

We study the problem of black-box optimization of a function f of any dimension, given function evaluations perturbed by noise. The function is assumed to be locally smooth around one of its global optima, but this smoothness is unknown.…

Machine Learning · Statistics 2026-05-05 Jean-Bastien Grill, Michal Valko, Rémi Munos

Middle-mile logistics through the lens of goal-conditioned reinforcement learning

Middle-mile logistics describes the problem of routing parcels through a network of hubs linked by trucks with finite capacity. We rephrase this as a multi-object goal-conditioned MDP. Our method combines graph neural networks with…

Machine Learning · Statistics 2026-05-05 Onno Eberhard, Thibaut Cuvelier, Michal Valko, Bruno De Backer

Active multiple matrix completion with adaptive confidence sets

In this work, we formulate a new multi-task active learning setting in which the learner's goal is to solve multiple matrix completion problems simultaneously. At each round, the learner can choose from which matrix it receives a sample…

Machine Learning · Statistics 2026-05-05 Andrea Locatelli, Alexandra Carpentier, Michal Valko

Measuring Differences between Conditional Distributions using Kernel Embeddings

Comparing conditional distributions is a fundamental challenge in statistics and machine learning, with applications across a wide range of domains. While proposed methods for measuring discrepancies using kernel embeddings of distributions…

Machine Learning · Statistics 2026-05-05 Peter Moskvichev, Siu Lun Chau, Dino Sejdinovic

The Causal Description Gap: Information-Theoretic Separations Across Pearl's Hierarchy

Pearl's causal hierarchy shows that observational, interventional, and counterfactual queries are qualitatively distinct. We ask a quantitative version of this question: how many additional bits are needed to specify higher-rung causal…

Machine Learning · Statistics 2026-05-05 Seyed Morteza Emadi

MIRA: A Score for Conditional Distribution Accuracy and Model Comparison

We introduce Mira, a sample-based score for assessing the accuracy of a candidate conditional distribution using only joint samples from the true data-generating process. Relying on the principle that distributions coincide if they assign…

Machine Learning · Statistics 2026-05-05 Sammy Sharief, Justine Zeghal, Gabriel Missael Barco, Pablo Lemos, Yashar Hezaveh, Laurence Perreault-Levasseur