Machine Learning - Scifaro

Efficient Learning of Stationary Diffusions with Stein-type Discrepancies

Learning a stationary diffusion amounts to estimating the parameters of a stochastic differential equation whose stationary distribution matches a target distribution. We build on the recently introduced kernel deviation from stationarity…

Machine Learning · Statistics 2026-01-30 Fabian Bleile, Sarah Lumpp, Mathias Drton

Diffusion Models in Simulation-Based Inference: A Tutorial Review

Diffusion models have recently emerged as powerful learners for simulation-based inference (SBI), enabling fast and accurate estimation of latent parameters from simulated and real data. Their score-based formulation offers a flexible way…

Machine Learning · Statistics 2026-01-30 Jonas Arruda, Niels Bracher, Ullrich Köthe, Jan Hasenauer, Stefan T. Radev

Online Bayesian Experimental Design for Partially Observed Dynamical Systems

Bayesian experimental design (BED) provides a principled framework for optimizing data collection by choosing experiments that are maximally informative about unknown parameters. However, existing methods cannot deal with the joint…

Machine Learning · Statistics 2026-01-30 Sara Pérez-Vieites, Sahel Iqbal, Simo Särkkä, Dominik Baumann

Stochastic Matching Bandits with Rare Optimization Updates

We introduce a bandit framework for stochastic matching under the multinomial logit (MNL) choice model. In our setting, $N$ agents on one side are assigned to $K$ arms on the other side, where each arm stochastically selects an agent from…

Machine Learning · Statistics 2026-01-30 Jung-hun Kim, Min-hwan Oh

Step-DAD: Semi-Amortized Policy-Based Bayesian Experimental Design

We develop a semi-amortized, policy-based approach to Bayesian experimental design (BED) called Stepwise Deep Adaptive Design (Step-DAD). Like existing fully amortized, policy-based BED approaches, Step-DAD trains a design policy upfront…

Machine Learning · Statistics 2026-01-30 Marcel Hedman, Desi R. Ivanova, Cong Guan, Tom Rainforth

POLAR: A Pessimistic Model-based Policy Learning Algorithm for Dynamic Treatment Regimes

Dynamic treatment regimes (DTRs) provide a principled framework for optimizing sequential decision-making in domains where decisions must adapt over time in response to individual trajectories, such as healthcare, education, and digital…

Machine Learning · Statistics 2026-01-30 Ruijia Zhang, Xiangyu Zhang, Zhengling Qi, Yue Wu, Yanxun Xu

Comparing regularisation paths of (conjugate) gradient estimators in ridge regression

We consider standard gradient descent, gradient flow and conjugate gradients as iterative algorithms for minimising a penalised ridge criterion in linear regression. While it is well known that conjugate gradients exhibit fast numerical…

Machine Learning · Statistics 2026-01-30 Laura Hucker, Markus Reiß, Thomas Stark

Rational Gaussian wavelets and corresponding model driven neural networks

In this paper we consider the continuous wavelet transform using Gaussian wavelets multiplied by an appropriate rational term. The zeros and poles of this rational modifier act as free parameters and their choice highly influences the shape…

Machine Learning · Statistics 2026-01-30 Attila Miklós Ámon, Kristian Fenech, Péter Kovács, Tamás Dózsa

VSCOUT: A Hybrid Variational Autoencoder Approach to Outlier Detection in High-Dimensional Retrospective Monitoring

Modern industrial and service processes generate high-dimensional, non-Gaussian, and contamination-prone data that challenge the foundational assumptions of classical Statistical Process Control (SPC). Heavy tails, multimodality, nonlinear…

Machine Learning · Statistics 2026-01-29 Waldyn G. Martinez

Demystifying Prediction Powered Inference

Machine learning predictions are increasingly used to supplement incomplete or costly-to-measure outcomes in fields such as biomedical research, environmental science, and social science. However, treating predictions as ground truth…

Machine Learning · Statistics 2026-01-29 Yilin Song, Dan M. Kluger, Harsh Parikh, Tian Gu

Incorporating data drift to perform survival analysis on credit risk

Survival analysis has become a standard approach for modelling time to default by time-varying covariates in credit risk. Unlike most existing methods that implicitly assume a stationary data-generating process, in practice, mortgage…

Machine Learning · Statistics 2026-01-29 Jianwei Peng, Stefan Lessmann

Physics-informed Blind Reconstruction of Dense Fields from Sparse Measurements using Neural Networks with a Differentiable Simulator

Generating dense physical fields from sparse measurements is a fundamental question in sampling, signal processing, and many other applications. State-of-the-art methods either use spatial statistics or rely on examples of dense fields in…

Machine Learning · Statistics 2026-01-29 Ofek Aloni, Barak Fishbain

Minimax Rates for Hyperbolic Hierarchical Learning

We prove an exponential separation in sample complexity between Euclidean and hyperbolic representations for learning on hierarchical data under standard Lipschitz regularization. For depth-$R$ hierarchies with branching factor $m$, we…

Machine Learning · Statistics 2026-01-29 Divit Rawal, Sriram Vishwanath

Deep Neural Networks as Iterated Function Systems and a Generalization Bound

Deep neural networks (DNNs) achieve remarkable performance on a wide range of tasks, yet their mathematical analysis remains fragmented: stability and generalization are typically studied in disparate frameworks and on a case-by-case basis.…

Machine Learning · Statistics 2026-01-29 Jonathan Vacher

Recurrent Neural Networks with Linear Structures for Electricity Price Forecasting

We present a novel recurrent neural network architecture specifically designed for day-ahead electricity price forecasting, aimed at improving short-term decision-making and operational management in energy systems. Our combined forecasting…

Machine Learning · Statistics 2026-01-29 Souhir Ben Amor, Florian Ziel

Efficient Group Lasso Regularized Rank Regression with Data-Driven Parameter Determination

High-dimensional regression often suffers from heavy-tailed noise and outliers, which can severely undermine the reliability of least-squares based methods. To improve robustness, we adopt a non-smooth Wilcoxon score based rank objective…

Machine Learning · Statistics 2026-01-29 Meixia Lin, Meijiao Shi, Yunhai Xiao, Qian Zhang

Online Conformal Model Selection for Nonstationary Time Series

This paper introduces the MPS (Model Prediction Set), a novel framework for online model selection for nonstationary time series. Classical model selection methods, such as information criteria and cross-validation, rely heavily on the…

Machine Learning · Statistics 2026-01-29 Shibo Li, Yao Zheng

Analyzing decision tree bias towards the minority class

There is a widespread and longstanding belief that machine learning models are biased towards the majority class when learning from imbalanced binary response data, leading them to neglect or ignore the minority class. Motivated by a recent…

Machine Learning · Statistics 2026-01-29 Nathan Phelps, Daniel J. Lizotte, Douglas G. Woolford

Revisiting Incremental Stochastic Majorization-Minimization Algorithms with Applications to Mixture of Experts

Processing high-volume, streaming data is increasingly common in modern statistics and machine learning, where batch-mode algorithms are often impractical because they require repeated passes over the full dataset. This has motivated…

Machine Learning · Statistics 2026-01-28 TrungKhang Tran, TrungTin Nguyen, Gersende Fort, Tung Doan, Hien Duy Nguyen, Binh T. Nguyen, Florence Forbes, Christopher Drovandi

Regularized $f$-Divergence Kernel Tests

We propose a framework to construct practical kernel-based two-sample tests from the family of $f$-divergences. The test statistic is computed from the witness function of a regularized variational representation of the divergence, which we…

Machine Learning · Statistics 2026-01-28 Mónica Ribero, Antonin Schrab, Arthur Gretton