Machine Learning — Scifaro

Backward Conformal Prediction

We introduce $\textit{Backward Conformal Prediction}$, a method that guarantees conformal coverage while providing flexible control over the size of prediction sets. Unlike standard conformal prediction, which fixes the coverage level and…

Machine Learning · Statistics 2026-02-13 Etienne Gauthier, Francis Bach, Michael I. Jordan

Minimax Optimal Estimation of Stability Under Distribution Shift

The performance of decision policies and prediction models often deteriorates when applied to environments different from the ones seen during training. To ensure reliable operation, we analyze the stability of a system under distribution…

Machine Learning · Statistics 2026-02-13 Hongseok Namkoong, Yuanzhe Ma, Peter W. Glynn

A Gibbs posterior sampler for inverse problem based on prior diffusion model

This paper addresses the issue of inversion in cases where (1) the observation system is modeled by a linear transformation and additive noise, (2) the problem is ill-posed and regularization is introduced in a Bayesian framework by an a…

Machine Learning · Statistics 2026-02-12 Jean-François Giovannelli

Optimal Initialization in Depth: Lyapunov Initialization and Limit Theorems for Deep Leaky ReLU Networks

The development of effective initialization methods requires an understanding of random neural networks. In this work, a rigorous probabilistic analysis of deep unbiased Leaky ReLU networks is provided. We prove a Law of Large Numbers and a…

Machine Learning · Statistics 2026-02-12 Constantin Kogler, Tassilo Schwarz, Samuel Kittle

Deep Learning of Compositional Targets with Hierarchical Spectral Methods

Why depth yields a genuine computational advantage over shallow methods remains a central open question in learning theory. We study this question in a controlled high-dimensional Gaussian setting, focusing on compositional target…

Machine Learning · Statistics 2026-02-12 Hugo Tabanelli, Yatin Dandi, Luca Pesce, Florent Krzakala

Convergence Rates for Distribution Matching with Sliced Optimal Transport

We study the slice-matching scheme, an efficient iterative method for distribution matching based on sliced optimal transport. We investigate convergence to the target distribution and derive quantitative non-asymptotic rates. To this end,…

Machine Learning · Statistics 2026-02-12 Gauthier Thurin, Claire Boyer, Kimia Nadjahi

A solvable high-dimensional model where nonlinear autoencoders learn structure invisible to PCA while test loss misaligns with generalization

Many real-world datasets contain hidden structure that cannot be detected by simple linear correlations between input features. For example, latent factors may influence the data in a coordinated way, even though their effect is invisible…

Machine Learning · Statistics 2026-02-12 Vicente Conde Mendes, Lorenzo Bardone, Cédric Koller, Jorge Medina Moreira, Vittorio Erba, Emanuele Troiani, Lenka Zdeborová

Beyond Kemeny Medians: Consensus Ranking Distributions Definition, Properties and Statistical Learning

In this article we develop a new method for summarizing a ranking distribution, i.e., a probability distribution on the symmetric group $\mathfrak{S}_n$, beyond the classical theory of consensus and Kemeny medians. Based on the…

Machine Learning · Statistics 2026-02-12 Stephan Clémençon, Ekhine Irurozki

Bayesian Inference of Contextual Bandit Policies via Empirical Likelihood

Policy inference plays an essential role in the contextual bandit problem. In this paper, we use empirical likelihood to develop a Bayesian inference method for the joint analysis of multiple contextual bandit policies in finite sample…

Machine Learning · Statistics 2026-02-12 Jiangrong Ouyang, Mingming Gong, Howard Bondell

Deep Bootstrap

In this work, we propose a novel deep bootstrap framework for nonparametric regression based on conditional diffusion models. Specifically, we construct a conditional diffusion model to learn the distribution of the response variable given…

Machine Learning · Statistics 2026-02-12 Jinyuan Chang, Yuling Jiao, Lican Kang, Junjie Shi

Statistical Inference and Learning for Shapley Additive Explanations (SHAP)

The SHAP (short for Shapley additive explanation) framework has become an essential tool for attributing importance to variables in predictive tasks. In model-agnostic settings, SHAP uses the concept of Shapley values from cooperative game…

Machine Learning · Statistics 2026-02-12 Justin Whitehouse, Ayush Sawarni, Vasilis Syrgkanis

Generalized Robust Adaptive-Bandwidth Multi-View Manifold Learning in High Dimensions with Noise

Multiview datasets are common in scientific and engineering applications, yet existing fusion methods offer limited theoretical guarantees, particularly in the presence of heterogeneous and high-dimensional noise. We propose Generalized…

Machine Learning · Statistics 2026-02-12 Xiucai Ding, Chao Shen, Hau-Tieng Wu

Dissecting Performative Prediction: A Comprehensive Survey

The field of performative prediction had its beginnings in 2020 with the seminal paper "Performative Prediction" by Perdomo et al., which established a novel machine learning setup where the deployment of a predictive model causes a…

Machine Learning · Statistics 2026-02-12 Thomas Kehrenberg, Javier Sanguino, Jose A. Lozano, Novi Quadrianto

Efficient Causal Structure Learning via Modular Subgraph Integration

Learning causal structures from observational data remains a fundamental yet computationally intensive task, particularly in high-dimensional settings where existing methods face challenges such as the super-exponential growth of the search…

Machine Learning · Statistics 2026-02-12 Haixiang Sun, Pengchao Tian, Zihan Zhou, Jielei Zhang, Peiyi Li, Andrew L. Liu

Conformal Prediction for Compositional Data

Dirichlet regression models are suitable for compositional data, in which the response variable represents proportions that sum to one. However, there are still no well-established methods for constructing valid prediction sets in this…

Machine Learning · Statistics 2026-02-12 Lucas P. Amaral, Luben M. C. Cabezas, Thiago R. Ramos, Gustavo H. G. A. Pereira

Decentralized Reinforcement Learning for Multi-Agent Multi-Resource Allocation via Dynamic Cluster Agreements

This paper addresses the challenge of allocating heterogeneous resources among multiple agents in a decentralized manner. Our proposed method, Liquid-Graph-Time Clustering-IPPO, builds upon Independent Proximal Policy Optimization (IPPO) by…

Machine Learning · Statistics 2026-02-12 Antonio Marino, Esteban Restrepo, Claudio Pacchierotti, Paolo Robuffo Giordano

Multi-Objective Bayesian Optimization for Networked Black-Box Systems: A Path to Greener Profits and Smarter Designs

Designing modern industrial systems requires balancing several competing objectives, such as profitability, resilience, and sustainability, while accounting for complex interactions between technological, economic, and environmental…

Machine Learning · Statistics 2026-02-12 Akshay Kudva, Wei-Ting Tang, Joel A. Paulson

Tensor learning with orthogonal, Lorentz, and symplectic symmetries

Tensors are a fundamental data structure for many scientific contexts, such as time series analysis, materials science, and physics, among many others. Improving our ability to produce and handle tensors is essential to efficiently address…

Machine Learning · Statistics 2026-02-12 Wilson G. Gregory, Josué Tonelli-Cueto, Nicholas F. Marshall, Andrew S. Lee, Soledad Villar

Diffusion posterior sampling for simulation-based inference in tall data settings

Identifying the parameters of a non-linear model that best explain observed data is a core task across scientific fields. When such models rely on complex simulators, evaluating the likelihood is typically intractable, making traditional…

Machine Learning · Statistics 2026-02-12 Julia Linhart, Gabriel Victorino Cardoso, Alexandre Gramfort, Sylvain Le Corff, Pedro L. C. Rodrigues

Fine-grained Analysis of Non-parametric Estimation for Pairwise Learning

In this paper, we are concerned with the generalization performance of non-parametric estimation for pairwise learning. Most of the existing work requires the hypothesis space to be convex or a VC-class, and the loss to be convex. However,…

Machine Learning · Statistics 2026-02-12 Junyu Zhou, Shuo Huang, Han Feng, Puyu Wang, Ding-Xuan Zhou