Machine Learning - Scifaro

The Cosine Schedule is Fisher-Rao-Optimal for Masked Discrete Diffusion Models

In this work, we study the problem of choosing the discretisation schedule for sampling from masked discrete diffusion models in terms of the information geometry of the induced probability path. Specifically, we show that the optimal…

Machine Learning · Statistics 2025-10-07 Leo Zhang, Saifuddin Syed

Uniform convergence of the smooth calibration error and its relationship with functional gradient

Calibration is a critical requirement for reliable probabilistic prediction, especially in high-risk applications. However, the theoretical understanding of which learning algorithms can simultaneously achieve high accuracy and good…

Machine Learning · Statistics 2025-10-07 Futoshi Futami, Atsushi Nitanda

Critical Points of Random Neural Networks

This work investigates the expected number of critical points of random neural networks with different activation functions as the depth increases in the infinite-width limit. Under suitable regularity conditions, we derive precise…

Machine Learning · Statistics 2025-10-07 Simmaco Di Lillo

Graph Alignment via Birkhoff Relaxation

We consider the graph alignment problem, wherein the objective is to find a vertex correspondence between two graphs that maximizes the edge overlap. The graph alignment problem is an instance of the quadratic assignment problem (QAP),…

Machine Learning · Statistics 2025-10-07 Sushil Mahavir Varma, Irène Waldspurger, Laurent Massoulié

Vector Copula Variational Inference and Dependent Block Posterior Approximations

The key to VI is the selection of a tractable density to approximate the Bayesian posterior. For large and complex models a common choice is to assume independence between multivariate blocks in a partition of the parameter space. While…

Machine Learning · Statistics 2025-10-07 Yu Fu, Michael Stanley Smith, Anastasios Panagiotelis

Efficient Sparsification of Simplicial Complexes via Local Densities of States

Simplicial complexes (SCs) have become a popular abstraction for analyzing complex data using tools from topological data analysis or topological signal processing. However, the analysis of many real-world datasets often leads to dense SCs,…

Machine Learning · Statistics 2025-10-07 Anton Savostianov, Michael T. Schaub, Nicola Guglielmi, Francesco Tudisco

A Statistical Hypothesis Testing Framework for Data Misappropriation Detection in Large Language Models

Large Language Models (LLMs) have rapidly gained enormous popularity in recent years. However, the training of LLMs has raised significant privacy and legal concerns, particularly regarding the distillation and inclusion of copyrighted…

Machine Learning · Statistics 2025-10-07 Yinpeng Cai, Lexin Li, Linjun Zhang

Counterfactual explainability and analysis of variance

Existing tools for explaining complex models and systems are associational rather than causal and do not provide mechanistic understanding. We propose a new notion called counterfactual explainability for causal attribution that is…

Machine Learning · Statistics 2025-10-07 Zijun Gao, Qingyuan Zhao

Machine Learning for Inverse Problems and Data Assimilation

The aim of these notes is to demonstrate the potential for ideas in machine learning to impact on the fields of inverse problems and data assimilation. The perspective is one that is primarily aimed at researchers from inverse problems…

Machine Learning · Statistics 2025-10-07 Eviatar Bach, Ricardo Baptista, Daniel Sanz-Alonso, Andrew Stuart

Approximation Bounds for Recurrent Neural Networks with Application to Regression

We study the approximation capacity of deep ReLU recurrent neural networks (RNNs) and explore the convergence properties of nonparametric least squares regression using RNNs. We derive upper bounds on the approximation error of RNNs for…

Machine Learning · Statistics 2025-10-07 Yuling Jiao, Yang Wang, Bokai Yan

Universality of Kernel Random Matrices and Kernel Regression in the Quadratic Regime

Kernel ridge regression (KRR) is a popular class of machine learning models that has become an important tool for understanding deep learning. Much of the focus thus far has been on studying the proportional asymptotic regime, $n \asymp d$,…

Machine Learning · Statistics 2025-10-07 Parthe Pandit, Zhichao Wang, Yizhe Zhu

Sharp Generalization for Nonparametric Regression in Interpolation Space by Over-Parameterized Neural Networks Trained with Preconditioned Gradient Descent and Early Stopping

We study nonparametric regression using an over-parameterized two-layer neural network trained with algorithmic guarantees. We consider the setting where the training features are drawn uniformly from the unit sphere in…

Machine Learning · Statistics 2025-10-07 Yingzhen Yang, Ping Li

Fitted value iteration methods for bicausal optimal transport

We develop a fitted value iteration (FVI) method to compute bicausal optimal transport (OT) where couplings have an adapted structure. Based on the dynamic programming formulation, FVI adopts a function class to approximate the value…

Machine Learning · Statistics 2025-10-07 Erhan Bayraktar, Bingyan Han

Mixtures of Gaussian Process Experts with SMC$^2$

Gaussian processes are a key component of many flexible statistical and machine learning models. However, they exhibit cubic computational complexity and high memory constraints due to the need to invert and store a full covariance…

Machine Learning · Statistics 2025-10-07 Teemu Härkönen, Sara Wade, Kody Law, Lassi Roininen

Neural Jump ODEs as Generative Models

In this work, we explore how Neural Jump ODEs (NJODEs) can be used as generative models for Itô processes. Given (discrete observations of) samples of a fixed underlying Itô process, the NJODE framework can be used to approximate the…

Machine Learning · Statistics 2025-10-06 Robert A. Crowell, Florian Krach, Josef Teichmann

Learning Multi-Index Models with Hyper-Kernel Ridge Regression

Deep neural networks excel in high-dimensional problems, outperforming models such as kernel methods, which suffer from the curse of dimensionality. However, the theoretical foundations of this success remain poorly understood. We follow…

Machine Learning · Statistics 2025-10-06 Shuo Huang, Hippolyte Labarrière, Ernesto De Vito, Tomaso Poggio, Lorenzo Rosasco

Beyond Linear Diffusions: Improved Representations for Rare Conditional Generative Modeling

Diffusion models have emerged as powerful generative frameworks with widespread applications across machine learning and artificial intelligence systems. While current research has predominantly focused on linear diffusions, these…

Machine Learning · Statistics 2025-10-06 Kulunu Dharmakeerthi, Yousef El-Laham, Henry H. Wong, Vamsi K. Potluru, Changhong He, Taosong He

Iteratively reweighted kernel machines efficiently learn sparse functions

The impressive practical performance of neural networks is often attributed to their ability to learn low-dimensional data representations and hierarchical structure directly from data. In this work, we argue that these two phenomena are…

Machine Learning · Statistics 2025-10-06 Libin Zhu, Damek Davis, Dmitriy Drusvyatskiy, Maryam Fazel

Fractional signature: a generalisation of the signature inspired by fractional calculus

In this paper, we propose a novel generalisation of the signature of a path, motivated by fractional calculus, which is able to describe the solutions of linear Caputo controlled FDEs. We also propose another generalisation of the…

Machine Learning · Statistics 2025-10-06 José Manuel Corcuera, Rubén Jiménez

Asymptotic theory of in-context learning by linear attention

Transformers have a remarkable ability to learn and execute tasks based on examples provided within the input itself, without explicit prior training. It has been argued that this capability, known as in-context learning (ICL), is a…

Machine Learning · Statistics 2025-10-06 Yue M. Lu, Mary I. Letey, Jacob A. Zavatone-Veth, Anindita Maiti, Cengiz Pehlevan