Machine Learning - Scifaro

CAOS: Conformal Aggregation of One-Shot Predictors

One-shot prediction enables rapid adaptation of pretrained foundation models to new tasks using only one labeled example, but lacks principled uncertainty quantification. While conformal prediction provides finite-sample coverage…

Machine Learning · Statistics 2026-02-02 Maja Waldron

Physics-Informed Neural Networks and Neural Operators for Parametric PDEs

PDEs arise ubiquitously in science and engineering, where solutions depend on parameters (physical properties, boundary conditions, geometry). Traditional numerical methods require re-solving the PDE for each parameter, making parameter…

Machine Learning · Statistics 2026-02-02 Zhuo Zhang, Xiong Xiong, Sen Zhang, Yuan Zhao, Xi Yang

Calibrating Decision Robustness via Inverse Conformal Risk Control

Robust optimization safeguards decisions against uncertainty by optimizing against worst-case scenarios, yet its effectiveness hinges on a prespecified robustness level that is often chosen ad hoc, leading to either insufficient…

Machine Learning · Statistics 2026-02-02 Wenbin Zhou, Shixiang Zhu

Generalization Dynamics of Linear Diffusion Models

Diffusion models are powerful generative models that produce high-quality samples from complex data. While their infinite-data behavior is well understood, their generalization with finite data remains less clear. Classical learning theory…

Machine Learning · Statistics 2026-02-02 Claudia Merger, Sebastian Goldt

Stein's method for marginals on large graphical models

Many spatial models exhibit locality structures that effectively reduce their intrinsic dimensionality, enabling efficient approximation and sampling of high-dimensional distributions. However, existing approximation techniques primarily…

Machine Learning · Statistics 2026-02-02 Tiangang Cui, Shuigen Liu, Xin T. Tong

Multivariate Bayesian Last Layer for Regression with Uncertainty Quantification and Decomposition

We present new Bayesian Last Layer neural network models in the setting of multivariate regression under heteroscedastic noise, and propose EM algorithms for parameter learning. Bayesian modeling of a neural network's final layer has the…

Machine Learning · Statistics 2026-02-02 Han Wang, Eiji Kawasaki, Guillaume Damblin, Geoffrey Daniel

Extending Mean-Field Variational Inference via Entropic Regularization: Theory and Computation

Variational inference (VI) has emerged as a popular method for approximate inference for high-dimensional Bayesian models. In this paper, we propose a novel VI method that extends the naive mean field via entropic regularization, referred…

Machine Learning · Statistics 2026-02-02 Bohan Wu, David Blei

A VAE Approach to Sample Multivariate Extremes

Generating accurate extremes from an observational data set is crucial when seeking to estimate risks associated with the occurrence of future extremes which could be larger than those already observed. Applications range from the…

Machine Learning · Statistics 2026-02-02 Nicolas Lafon, Philippe Naveau, Ronan Fablet

Three approaches to supervised learning for compositional data with pairwise logratios

The common approach to compositional data analysis is to transform the data by means of logratios. Logratios between pairs of compositional parts (pairwise logratios) are the easiest to interpret in many research problems. When the number…

Machine Learning · Statistics 2026-02-02 Germa Coenders, Michael Greenacre

Efficient Stochastic Optimisation via Sequential Monte Carlo

The problem of optimising functions with intractable gradients frequently arises in machine learning and statistics, ranging from maximum marginal likelihood estimation procedures to fine-tuning of generative models. Stochastic approximation…

Machine Learning · Statistics 2026-01-30 James Cuin, Davide Carbone, Yanbo Tang, O. Deniz Akyildiz

Near-Optimal Private Tests for Simple and MLR Hypotheses

We develop a near-optimal testing procedure under the framework of Gaussian differential privacy for simple as well as one- and two-sided tests under monotone likelihood ratio conditions. Our mechanism is based on a private mean estimator…

Machine Learning · Statistics 2026-01-30 Yu-Wei Chen, Raghu Pasupathy, Jordan Awan

Clustering in Deep Stochastic Transformers

Transformers have revolutionized deep learning across various domains but understanding the precise token dynamics remains a theoretical challenge. Existing theories of deep Transformers with layer normalization typically predict that…

Machine Learning · Statistics 2026-01-30 Lev Fedorov, Michaël E. Sander, Romuald Elie, Pierre Marion, Mathieu Laurière

On Forgetting and Stability of Score-based Generative models

Understanding the stability and long-time behavior of generative models is a fundamental problem in modern machine learning. This paper provides quantitative bounds on the sampling error of score-based generative models by leveraging…

Machine Learning · Statistics 2026-01-30 Stanislas Strasman, Gabriel Cardoso, Sylvain Le Corff, Vincent Lemaire, Antonio Ocello

A Judge-Aware Ranking Framework for Evaluating Large Language Models without Ground Truth

Evaluating large language models (LLMs) on open-ended tasks without ground-truth labels is increasingly done via the LLM-as-a-judge paradigm. A critical but under-modeled issue is that judge LLMs differ substantially in reliability;…

Machine Learning · Statistics 2026-01-30 Mingyuan Xu, Xinzi Tan, Jiawei Wu, Doudou Zhou

Questioning the Coverage-Length Metric in Conformal Prediction: When Shorter Intervals Are Not Better

Conformal prediction (CP) has become a cornerstone of distribution-free uncertainty quantification, conventionally evaluated by its coverage and interval length. This work critically examines the sufficiency of these standard metrics. We…

Machine Learning · Statistics 2026-01-30 Yizhou Min, Yizhou Lu, Lanqi Li, Zhen Zhang, Jiaye Teng

Bulk-Calibrated Credal Ambiguity Sets: Fast, Tractable Decision Making under Out-of-Sample Contamination

Distributionally robust optimisation (DRO) minimises the worst-case expected loss over an ambiguity set that can capture distributional shifts in out-of-sample environments. While Huber (linear-vacuous) contamination is a classical…

Machine Learning · Statistics 2026-01-30 Mengqi Chen, Thomas B. Berrett, Theodoros Damoulas, Michele Caprio

A Flexible Empirical Bayes Approach to Generalized Linear Models, with Applications to Sparse Logistic Regression

We introduce a flexible empirical Bayes approach for fitting Bayesian generalized linear models. Specifically, we adopt a novel mean-field variational inference (VI) method and the prior is estimated within the VI algorithm, making the…

Machine Learning · Statistics 2026-01-30 Dongyue Xie, Wanrong Zhu, Matthew Stephens

Multilevel and Sequential Monte Carlo for Training-Free Diffusion Guidance

We address the problem of accurate, training-free guidance for conditional generation in trained diffusion models. Existing methods typically rely on point-estimates to approximate the posterior score, often resulting in biased…

Machine Learning · Statistics 2026-01-30 Aidan Gleich, Scott C. Schmidler

An efficient, accurate, and interpretable machine learning method for computing probability of failure

We introduce a novel machine learning method called the Penalized Profile Support Vector Machine based on the Gabriel edited set for the computation of the probability of failure for a complex system as determined by a threshold condition…

Machine Learning · Statistics 2026-01-30 Jacob Zhu, Donald Estep

Diffusion-based Annealed Boltzmann Generators: benefits, pitfalls and hopes

Sampling configurations at thermodynamic equilibrium is a central challenge in statistical physics. Boltzmann Generators (BGs) tackle it by combining a generative model with a Monte Carlo (MC) correction step to obtain asymptotically…

Machine Learning · Statistics 2026-01-30 Louis Grenioux, Maxence Noble