Machine Learning - Scifaro

Mini-batch Estimation for Deep Cox Models: Statistical Foundations and Practical Guidance

The stochastic gradient descent (SGD) algorithm has been widely used to optimize deep Cox neural networks (Cox-NN) by updating model parameters using mini-batches of data. We show that SGD aims to optimize the average of mini-batch…

Machine Learning · Statistics 2026-04-16 Lang Zeng, Weijing Tang, Zhao Ren, Ying Ding

Nonparametric Sparse Online Learning of the Koopman Operator

The Koopman operator provides a powerful framework for representing the dynamics of general nonlinear dynamical systems. However, existing data-driven approaches to learning the Koopman operator rely on batch data. In this work, we present…

Machine Learning · Statistics 2026-04-16 Boya Hou, Sina Sanjari, Nathan Dahlin, Alec Koppel, Subhonmesh Bose

New Equivalences Between Interpolation and SVMs: Kernels and Structured Features

The support vector machine (SVM) is a supervised learning algorithm that finds a maximum-margin linear classifier, often after mapping the data to a high-dimensional feature space via the kernel trick. Recent work has demonstrated that in…

Machine Learning · Statistics 2026-04-16 Chiraag Kaushik, Andrew D. McRae, Mark A. Davenport, Vidya Muthukumar

Causal Diffusion Models for Counterfactual Outcome Distributions in Longitudinal Data

Predicting counterfactual outcomes in longitudinal data, where sequential treatment decisions heavily depend on evolving patient states, is critical yet notoriously challenging due to complex time-dependent confounding and inadequate…

Machine Learning · Statistics 2026-04-15 Farbod Alinezhad, Jianfei Cao, Gary J. Young, Brady Post

A Bayesian Perspective on the Role of Epistemic Uncertainty for Delayed Generalization in In-Context Learning

In-context learning enables transformers to adapt to new tasks from a few examples at inference time, while grokking highlights that this generalization can emerge abruptly only after prolonged training. We study task generalization and…

Machine Learning · Statistics 2026-04-15 Abdessamed Qchohi, Simone Rossi

Information-Geometric Decomposition of Generalization Error in Unsupervised Learning

We decompose the Kullback-Leibler generalization error (GE) of unsupervised learning, defined as the expected KL divergence from the data distribution to the trained model, into three non-negative components: model error, data bias, and variance.…

Machine Learning · Statistics 2026-04-15 Gilhan Kim

A Nonparametric Adaptive EWMA Control Chart for Binary Monitoring of Multiple Stream Processes

Monitoring binomial proportions across multiple independent streams is a critical challenge in Statistical Process Control (SPC), with applications from manufacturing to cybersecurity. While EWMA charts offer sensitivity to small shifts,…

Machine Learning · Statistics 2026-04-15 Faruk Muritala, Austin Brown, Dhrubajyoti Ghosh, Sherry Ni

On the continuum limit of t-SNE for data visualization

This work is concerned with the continuum limit of a graph-based data visualization technique called the t-Distributed Stochastic Neighbor Embedding (t-SNE), which is widely used for visualizing data in a variety of applications, but is…

Machine Learning · Statistics 2026-04-15 Jeff Calder, Zhonggan Huang, Ryan Murray, Adam Pickarski

Obtaining Partition Crossover masks using Statistical Linkage Learning for solving noised optimization problems with hidden variable dependency structure

In optimization problems, some variable subsets may have a joint non-linear or non-monotonic influence on the function value. Therefore, knowledge of variable dependencies may be crucial for effective optimization, and many…

Machine Learning · Statistics 2026-04-15 M. W. Przewozniczek, B. Frej, M. M. Komarnicki, M. Prusik, R. Tinós

Discrete Flow Maps

The sequential nature of autoregressive next-token prediction imposes a fundamental speed limit on large language models. While continuous flow models offer a path to parallel generation, they traditionally demand expensive iterative…

Machine Learning · Statistics 2026-04-15 Peter Potaptchik, Jason Yim, Adhi Saravanan, Peter Holderrieth, Eric Vanden-Eijnden, Michael S. Albergo

Experimental Design for Missing Physics

For most process systems, knowledge of the model structure is incomplete. This missing physics must then be learned from experimental data. Recently, a combination of universal differential equations and symbolic regression has become a…

Machine Learning · Statistics 2026-04-15 Arno Strouwen, Sebastián Micluţa-Câmpeanu

A Theoretical Comparison of No-U-Turn Sampler Variants: Necessary and Sufficient Convergence Conditions and Mixing Time Analysis under Gaussian Targets

The No-U-Turn Sampler (NUTS) is the computational workhorse of modern Bayesian software libraries, yet its qualitative and quantitative convergence guarantees were established only recently. A significant gap remains in the theoretical…

Machine Learning · Statistics 2026-04-15 Samuel Gruffaz, Kyurae Kim, Fares Guehtar, Hadrien Duval-decaix, Pacôme Trautmann

Graphical model for factorization and completion of relatively high rank tensors by sparse sampling

We consider tensor factorizations based on sparse measurements of the components of relatively high rank tensors. The measurements are designed so that the underlying graph of interactions is a random graph. The setup will be useful…

Machine Learning · Statistics 2026-04-15 Angelo Giorgio Cavaliere, Riki Nagasawa, Shuta Yokoi, Tomoyuki Obuchi, Hajime Yoshino

Gaussian Equivalence for Self-Attention: Asymptotic Spectral Analysis of Attention Matrix

Self-attention layers have become fundamental building blocks of modern deep neural networks, yet their theoretical understanding remains limited, particularly from the perspective of random matrix theory. In this work, we provide a…

Machine Learning · Statistics 2026-04-15 Tomohiro Hayase, Benoît Collins, Ryo Karakida

On the Convergence Analysis of Muon

The majority of parameters in neural networks are naturally represented as matrices. However, most commonly used optimizers treat these matrix parameters as flattened vectors during optimization, potentially overlooking their inherent…

Machine Learning · Statistics 2026-04-15 Wei Shen, Ruichuan Huang, Minhui Huang, Cong Shen, Jiawei Zhang

Privacy-Preserving Transfer Learning for Community Detection using Locally Distributed Multiple Networks

Modern applications increasingly involve highly sensitive network data, where raw edges cannot be shared due to privacy constraints. We propose TransNet, a new spectral clustering-based transfer learning framework that improves…

Machine Learning · Statistics 2026-04-15 Xiao Guo, Xuming He, Xiangyu Chang, Shujie Ma

The Illusion of Fit: Spatially Resolved Assessment of Constitutive Model Validity in Elastography and Physics-Based Inverse Problems

Inferring the mechanical properties of soft tissues from measured deformations is a fundamental challenge in elastography. A rarely examined assumption underlying existing approaches is that the assumed constitutive law correctly describes…

Machine Learning · Statistics 2026-04-15 Vincent C. Scholz, P. S. Koutsourelakis

Gradient flow dynamics of shallow ReLU networks for square loss and orthogonal inputs

The training of neural networks by gradient descent methods is a cornerstone of the deep learning revolution. Yet, despite some recent progress, a complete theory explaining its success is still missing. This article presents, for…

Machine Learning · Statistics 2026-04-15 Etienne Boursier, Loucas Pillaud-Vivien, Nicolas Flammarion

ADD for Multi-Bit Image Watermarking

As generative models enable rapid creation of high-fidelity images, societal concerns about misinformation and authenticity have intensified. A promising remedy is multi-bit image watermarking, which embeds a multi-bit message into an image…

Machine Learning · Statistics 2026-04-14 An Luo, Jie Ding

Trustworthy Feature Importance Avoids Unrestricted Permutations

Feature importance methods using unrestricted permutations are flawed due to extrapolation errors; such errors appear in all non-trivial variable importance approaches. We propose three new approaches: conditional model reliance and…

Machine Learning · Statistics 2026-04-14 Emanuele Borgonovo, Francesco Cappelli, Xuefei Lu, Elmar Plischke, Cynthia Rudin