Machine Learning - Scifaro

A note on connections between the Föllmer process and the denoising diffusion probabilistic model

The Föllmer process is a Brownian motion conditioned to have a pre-specified distribution at time 1. This process can be interpreted as an "augmented" time-compressed version of the reverse stochastic differential equation (SDE) for the…

Machine Learning · Statistics 2026-05-19 Yuta Koike

A data-driven Fourier-mixture neural-network method for density estimation

We propose a data-driven Fourier-trained neural-network method for estimating fixed-horizon probability densities from empirical characteristic-function (CF) information. The estimator is a positive Gaussian-Laplace mixture with…

Machine Learning · Statistics 2026-05-19 Duy-Minh Dang, Volter Entoma

Simple Approximation and Derivative Free Inference-Time Scaling for Diffusion Models via Sequential Monte Carlo on Path Measures

Diffusion-based generative models increasingly rely on inference-time guidance, adding a drift term or reweighting mixture of experts, to improve sample quality on task-specific objectives. However, most existing techniques require repeated…

Machine Learning · Statistics 2026-05-19 Chenyang Wang, Weizhong Wang, Yinuo Ren, Jose Blanchet, Yiping Lu

StatQAT: Statistical Quantizer Optimization for Deep Networks

Quantization is essential for reducing the computational cost and memory usage of deep neural networks, enabling efficient inference on low-precision hardware. Despite the growing adoption of uniform and floating-point quantization schemes,…

Machine Learning · Statistics 2026-05-19 Mehmet Aktukmak, Daniel Huang, Ke Ding

How does feature learning reshape the function space?

Feature learning is widely regarded as the key mechanism distinguishing neural networks from fixed-kernel methods, yet its impact on the induced function space remains poorly understood. In this work, we precisely characterize how the…

Machine Learning · Statistics 2026-05-19 João Lobo, Bruno Loureiro, Long Tran-Than, Fanghui Liu

Online Conformal Prediction for Non-Exchangeable Panel Data

Panel data, in which multiple units are repeatedly observed over time, arise throughout science and engineering. Quantifying predictive uncertainty in such settings is challenging because conformal prediction, while distribution-free and…

Machine Learning · Statistics 2026-05-19 Daohong Tu, Kay Giesecke

On Gaussian approximation for entropy-regularized Q-learning with function approximation

In this paper, we derive rates of convergence in the high-dimensional central limit theorem for Polyak-Ruppert averaged iterates generated by entropy-regularized asynchronous Q-learning with linear function approximation and a polynomial…

Machine Learning · Statistics 2026-05-19 Artemy Rubtsov, Rahul Singh, Eric Moulines, Alexey Naumov, Sergey Samsonov

Sample efficient inductive matrix completion with noise and inexact side information

Low-rank matrix completion is a widely studied problem with many variants. Inductive matrix completion (IMC) incorporates row and column side information to significantly narrow the search space. Prior work falls into two regimes: methods…

Machine Learning · Statistics 2026-05-19 Yuepeng Yang, Cong Ma

Multi-task Linear Regression without Eigenvalue Lower Bounds: Adaptivity, Robustness and Safety

We study the multi-task linear regression problem in the presence of contaminated tasks. We address the setting where the unknown parameters of a majority of tasks are close in the $\ell_2$-norm, while a fraction of tasks are arbitrary…

Machine Learning · Statistics 2026-05-19 Seok-Jin Kim

Diffusion-Based Stochastic Operator Networks for Uncertainty Quantification in Stochastic Partial Differential Equations

We introduce a novel framework for uncertainty quantification of solution operators associated with stochastic partial differential equations (SPDEs). Although SPDEs play a central role in modeling complex physical systems under…

Machine Learning · Statistics 2026-05-19 Phuoc-Toan Huynh, Richard Archibald, Feng Bao

CAST: Causal Anchored Simplex Transport for Distribution-Valued Time Series

Many decision-facing stochastic systems are observed through aggregate distributions rather than scalar trajectories: queue occupancies, mobility shares, public-health mixtures, generation-source shares, ecological compositions, and…

Machine Learning · Statistics 2026-05-19 Jiecheng Lu, Jieqi Di, Runhua Wu, Yuwei Zhou

A Fourier perspective on the learning dynamics of neural networks: from sample complexities to mechanistic insights

Neural networks trained with gradient-based methods exhibit a strong simplicity bias: they learn simpler statistical features of their data before moving to more complex features. Previous analyses of this phenomenon have largely focused on…

Machine Learning · Statistics 2026-05-19 Fabiola Ricci, Claudia Merger, Sebastian Goldt

HYVINT: Intensity-Driven Hypergraph Generation with Variational Representations

Hypergraphs provide a principled framework for modeling polyadic interactions, with applications in recommendation systems, social networks, and molecular modeling. Hypergraph generation remains challenging because incidence structures are…

Machine Learning · Statistics 2026-05-19 Xinyi Hong, Shuntuo Xu, Zhou Yu

Prediction-Intervention Games and Invariant Sets

We consider the following two-player game: using observational data, the leader chooses a prediction function for a response variable $Y$ from given covariates. The follower then reacts with an intervention on some covariates in the…

Machine Learning · Statistics 2026-05-19 Linus Kühne, Felix Schur, Jonas Peters

Isotonic Survival Regression: Calibrated Survival Distributions from Deep Cox Models

Time-to-event data is widespread across the life sciences and engineering, but it is typically encountered together with censoring, which complicates the application of standard machine learning methods. Deep Cox models have emerged as a…

Machine Learning · Statistics 2026-05-19 Anchit Jain, Kevin Zhang, Stephen Bates

StAD: Stein Amortized Divergence for Fast Likelihoods with Diffusion and Flow

Diffusion and flow-based models are ubiquitously used for generative modelling and density estimation. They admit a deterministic probability flow ordinary differential equation (PF-ODE), analogous to continuous normalizing flows (CNFs),…

Machine Learning · Statistics 2026-05-19 Gurjeet Jagwani, Stephen Thorp, Sinan Deger, Hiranya Peiris

Dimension-Uniform Discretization Analysis of Preconditioned Annealed Langevin Dynamics for Multimodal Gaussian Mixtures

Obtaining stable diffusion-based samplers in high- and infinite-dimensional settings is challenging because errors can accumulate across high-frequency coordinates and make the dynamics unstable under refinement of the finite-dimensional…

Machine Learning · Statistics 2026-05-19 Lorenzo Baldassari, Josselin Garnier, Knut Solna, Maarten V. de Hoop

Reframing preprocessing selection as model-internal calibration in near-infrared spectroscopy: A large-scale benchmark of operator-adaptive PLS and Ridge models

Preprocessing screening is often the most expensive part of a near-infrared spectroscopy calibration workflow. It works because smoothing, derivatives, detrending and related filters change the spectral directions seen by PLS or Ridge…

Machine Learning · Statistics 2026-05-19 Gregory Beurier, Robin Reiter, Camille Noûs, Lauriane Rouan, Denis Cornet

SMART Fine-tuning Factor Augmented Neural Lasso

Fine-tuning is a widely used strategy for adapting pre-trained models to new tasks, yet its methodology and theoretical properties in high-dimensional nonparametric settings with variable selection have not yet been developed. We propose a…

Machine Learning · Statistics 2026-05-19 Jinhang Chai, Jianqing Fan, Cheng Gao, Qishuo Yin

On the Expressive Power of Contextual Relations in Transformers

Transformer architectures have achieved remarkable empirical success in modeling contextual relations, yet a clear understanding of their expressive power is still lacking. In this work, we introduce a measure-theoretic framework in which…

Machine Learning · Statistics 2026-05-19 Demián Fraiman