Machine Learning - Scifaro

Rethinking Forward Processes for Score-Based Nonlinear Data Assimilation in High Dimensions

Data assimilation is the process of estimating the state of a dynamical system over time by combining model predictions with measurements. This task becomes challenging when the system is nonlinear and high-dimensional. To address this,…

Machine Learning · Statistics 2026-05-22 Eunbi Yoon, Won Chang, Donghan Kim, Dae Wook Kim

The Volterra signature

Modern approaches for learning from non-Markovian time series, such as recurrent neural networks, neural controlled differential equations or transformers, typically rely on implicit memory mechanisms that can be difficult to interpret or…

Machine Learning · Statistics 2026-05-22 Paul P. Hager, Fabian N. Harang, Luca Pelizzari, Samy Tindel

A Diffusive Classification Loss for Learning Energy-based Generative Models

Score-based generative models have recently achieved remarkable success. While they are usually parameterized by the score, an alternative way is to use a series of time-dependent energy-based models (EBMs), where the score is obtained from…

Machine Learning · Statistics 2026-05-22 RuiKang OuYang, Louis Grenioux, José Miguel Hernández-Lobato

BALLAST: Bayesian Active Learning with Look-ahead Amendment for Sea-drifter Trajectories under Spatio-Temporal Vector Fields

We introduce a formal active learning methodology for guiding the placement of Lagrangian observers to infer time-dependent vector fields -- a key task in oceanography, marine science, and ocean engineering -- using a physics-informed…

Machine Learning · Statistics 2026-05-22 Rui-Yang Zhang, Lachlan Astfalck, Edward Cripps, David S. Leslie, Henry B. Moss

Two failure modes of deep transformers and how to avoid them: a unified theory of signal propagation at initialisation

Finding the right initialisation for neural networks is crucial to ensure smooth training and good performance. In transformers, the wrong initialisation can lead to one of two failure modes of self-attention layers: rank collapse, where…

Machine Learning · Statistics 2026-05-22 Alessio Giorlandino, Sebastian Goldt

On Statistical Estimation of Edge-Reinforced Random Walks

Reinforced random walks (RRWs), including vertex-reinforced random walks (VRRWs) and edge-reinforced random walks (ERRWs), model random walks where the transition probabilities evolve based on prior visitation history…

Machine Learning · Statistics 2026-05-22 Qinghua Ding, Venkat Anantharam

Prior shift estimation for positive unlabeled data through the lens of kernel embedding

We study estimation of a class prior for unlabeled target samples which possibly differs from that of source population. Moreover, it is assumed that the source data is partially observable: only samples from the positive class and from the…

Machine Learning · Statistics 2026-05-22 Jan Mielniczuk, Wojciech Rejchel, Paweł Teisseyre

Uncertainty quantification for Markov chain induced martingales with application to temporal difference learning

We establish novel and general high-dimensional concentration inequalities and Berry-Esseen bounds for vector-valued martingales induced by Markov chains. We apply these results to analyze the performance of the Temporal Difference (TD)…

Machine Learning · Statistics 2026-05-22 Weichen Wu, Yuting Wei, Alessandro Rinaldo

Memorisation, convergence and generalisation in generative models

Generative neural networks learn how to produce highly realistic images from a large, but finite number of examples - or do they simply memorise their training set? To settle this question, Kadkhodaie, Guth, Simoncelli and Mallat (ICLR '24)…

Machine Learning · Statistics 2026-05-21 Antoine Maillard, Sebastian Goldt

Semiparametric Efficient Bilevel Gradient Estimation

Functional bilevel methods estimate a lower-level function and plug it into a hypergradient, but this plug-in gradient can retain first-order bias when the lower-level problem is learned nonparametrically. To remove this bias, we develop a…

Machine Learning · Statistics 2026-05-21 Fares El Khoury, Houssam Zenati, Nathan Kallus, Michael Arbel, Aurélien Bibaut

Large-Step Training Dynamics of a Two-Factor Linear Transformer Model

Gradient-flow analyses show that simplified linear transformers can learn the in-context linear-regression algorithm, but they do not explain the finite-step behavior of gradient descent at large learning rates. Motivated by empirical work…

Machine Learning · Statistics 2026-05-21 Krishnakumar Balasubramanian

Theoretical guidelines for annealed Langevin dynamics in compositional simulation-based inference

Compositional score-based approaches to simulation-based inference (SBI) approximate the posterior over a shared parameter given $n$ independent observations by aggregating individually learned posterior scores: currently, there are two…

Machine Learning · Statistics 2026-05-21 Camille Touron, Gabriel V. Cardoso, Julyan Arbel, Pedro L. C. Rodrigues

Federated LoRA Fine-Tuning for LLMs via Collaborative Alignment

Low-rank adaptation (LoRA) has emerged as a powerful tool for parameter-efficient fine-tuning of large language models (LLMs). This paper studies LoRA under a federated learning setting, enabling collaborative fine-tuning across clients…

Machine Learning · Statistics 2026-05-21 Shuaida He, Liwen Chen, Long Feng

A Rigorous, Tractable Measure of Model Complexity

An accurate assessment of a model's complexity is crucial for topics such as interpretation, generalization, and model selection. However, most existing complexity measures either rely on heuristic assumptions or are computationally…

Machine Learning · Statistics 2026-05-21 Oskar Allerbo, Thomas B. Schön

Conditioning Gaussian Processes on Almost Anything

Gaussian processes (GPs) offer a principled probabilistic model over functions, but exact inference is restricted to the linear-Gaussian regime. We establish an explicit equivalence between GPs and a class of linear diffusion models,…

Machine Learning · Statistics 2026-05-21 Henry Moss, Lachlan Astfalck, Thomas Cowperthwaite, Colin Doumont, Sam Willis, Philipp Hennig, Christopher Nemeth, Andrew Zammit-Mangion

Group-Aware Matrix Estimation and Latent Subspace Recovery

Modern matrix completion problems often involve heterogeneous data whose rows simultaneously belong to many meta-categories, such as demographic and age groups in recommendation systems, or region and recording session labels in neural…

Machine Learning · Statistics 2026-05-21 Hamza Golubovic, Matthew Shen, Genevera I. Allen, Tarek M. Zikry

Spectral bandits for smooth graph functions with applications in recommender systems

Smooth functions on graphs have wide applications in manifold and semi-supervised learning. In this paper, we study a bandit problem where the payoffs of arms are smooth on a graph. This framework is suitable for solving online learning…

Machine Learning · Statistics 2026-05-21 Tomáš Kocák, Michal Valko, Rémi Munos, Branislav Kveton, Shipra Agrawal

Sample Complexity of Transfer Learning: An Optimal Transport Approach

Transfer learning is an essential technique for many machine learning/AI models of complex structures such as large language models and generative AI. The essence of transfer learning is to leverage knowledge from resolved source tasks for…

Machine Learning · Statistics 2026-05-21 Haoyang Cao, Xin Guo, Wenpin Tang, Guan Wang

Contradiction Graphs Determine VC Dimension

We study the contradiction graphs associated with binary concept classes. For a class $H \subseteq \{0,1\}^X$, the order-$m$ contradiction graph $G_m(H)$ has as vertices the $H$-realizable labeled sequences of length $m$, with two vertices…

Machine Learning · Statistics 2026-05-21 Jesse Campbell, Daniel Ibaibarriaga, Lev Reyzin

Corrected Integrated Laplace Approximation for Bayesian Inference in Latent Gaussian Models

Latent Gaussian models (LGMs) are a popular class of Bayesian hierarchical models that include Gaussian processes, as well as certain spatial models and mixed-effect models. Efficient Bayesian inference of LGMs often requires marginalizing…

Machine Learning · Statistics 2026-05-21 Jinlin Lai, Charles C. Margossian, Daniel R. Sheldon