Machine Learning — Scifaro

On the Effect of Instability on Learning Continuous-Time Linear Control Systems

We study the problem of system identification for stochastic continuous-time dynamics, based on a single finite-length state trajectory. We present a method for estimating the possibly unstable open-loop matrix by employing properly…

Machine Learning · Statistics 2025-09-30 Reza Sadeghi Hafshejani, Mohamad Kazem Shirani Faradonbeh

Bayesian Autoregressive Online Change-Point Detection with Time-Varying Parameters

Change points in real-world systems mark significant regime shifts in system dynamics, possibly triggered by exogenous or endogenous factors. These points define regimes for the time evolution of the system and are crucial for understanding…

Machine Learning · Statistics 2025-09-30 Ioanna-Yvonni Tsaknaki, Fabrizio Lillo, Piero Mazzarisi

Optimal thresholds and algorithms for a model of multi-modal learning in high dimensions

This work explores multi-modal inference in a high-dimensional simplified model, analytically quantifying the performance gain of multi-modal inference over that of analyzing modalities in isolation. We present the Bayes-optimal performance…

Machine Learning · Statistics 2025-09-30 Christian Keup, Lenka Zdeborová

SEMF: Supervised Expectation-Maximization Framework for Predicting Intervals

This work introduces the Supervised Expectation-Maximization Framework (SEMF), a versatile and model-agnostic approach for generating prediction intervals with any ML model. SEMF extends the Expectation-Maximization algorithm, traditionally…

Machine Learning · Statistics 2025-09-30 Ilia Azizi, Marc-Olivier Boldi, Valérie Chavez-Demoulin

Off-Policy Evaluation in Markov Decision Processes under Weak Distributional Overlap

Doubly robust methods hold considerable promise for off-policy evaluation in Markov decision processes (MDPs) under sequential ignorability: They have been shown to converge as $1/\sqrt{T}$ with the horizon $T$, to be statistically…

Machine Learning · Statistics 2025-09-30 Mohammad Mehrabi, Stefan Wager

Network inference via process motifs for lagged correlation in linear stochastic processes

A major challenge for causal inference from time-series data is the trade-off between computational feasibility and accuracy. Motivated by process motifs for lagged covariance in an autoregressive model with slow mean-reversion, we propose…

Machine Learning · Statistics 2025-09-30 Alice C. Schwarze, Sara M. Ichinaga, Bingni W. Brunton

Towards Efficient Online Exploration for Reinforcement Learning with Human Feedback

Reinforcement learning with human feedback (RLHF), which learns a reward model from human preference data and then optimizes a policy to favor preferred responses, has emerged as a central paradigm for aligning large language models (LLMs)…

Machine Learning · Statistics 2025-09-29 Gen Li, Yuling Yan

Metrics for Parametric Families of Networks

We introduce a general framework for analyzing data modeled as parameterized families of networks. Building on a Gromov-Wasserstein variant of optimal transport, we define a family of parameterized Gromov-Wasserstein distances for comparing…

Machine Learning · Statistics 2025-09-29 Mario Gómez, Guanqun Ma, Tom Needham, Bei Wang

Smoothing-Based Conformal Prediction for Balancing Efficiency and Interpretability

Conformal Prediction (CP) is a distribution-free framework for constructing statistically rigorous prediction sets. While popular variants such as CD-split improve CP's efficiency, they often yield prediction sets composed of multiple…

Machine Learning · Statistics 2025-09-29 Mingyi Zheng, Hongyu Jiang, Yizhou Lu, Jiaye Teng

Multidimensional Uncertainty Quantification via Optimal Transport

Most uncertainty quantification (UQ) approaches provide a single scalar value as a measure of model reliability. However, different uncertainty measures could provide complementary information on the prediction confidence. Even measures…

Machine Learning · Statistics 2025-09-29 Nikita Kotelevskii, Maiya Goloburda, Vladimir Kondratyev, Alexander Fishkov, Mohsen Guizani, Eric Moulines, Maxim Panov

A Random Matrix Perspective of Echo State Networks: From Precise Bias-Variance Characterization to Optimal Regularization

We present a rigorous asymptotic analysis of Echo State Networks (ESNs) in a teacher-student setting with a linear teacher with oracle weights. Leveraging random matrix theory, we derive closed-form expressions for the asymptotic bias,…

Machine Learning · Statistics 2025-09-29 Yessin Moakher, Malik Tiomoko, Cosme Louart, Zhenyu Liao

Causal-EPIG: A Prediction-Oriented Active Learning Framework for CATE Estimation

Estimating the Conditional Average Treatment Effect (CATE) is often constrained by the high cost of obtaining outcome measurements, making active learning essential. However, conventional active learning strategies suffer from a fundamental…

Machine Learning · Statistics 2025-09-29 Erdun Gao, Jake Fawkes, Dino Sejdinovic

Effective continuous equations for adaptive SGD: a stochastic analysis view

We present a theoretical analysis of some popular adaptive Stochastic Gradient Descent (SGD) methods in the small learning rate regime. Using the stochastic modified equations framework introduced by Li et al., we derive effective…

Machine Learning · Statistics 2025-09-29 Luca Callisti, Marco Romito, Francesco Triggiano

A Unified Empirical Risk Minimization Framework for Flexible N-Tuples Weak Supervision

To alleviate the annotation burden in supervised learning, N-tuples learning has recently emerged as a powerful weakly-supervised method. While existing N-tuples learning approaches extend pairwise learning to higher-order comparisons and…

Machine Learning · Statistics 2025-09-29 Shuying Huang, Junpeng Li, Changchun Hua, Yana Yang

Detecting Scarce and Sparse Anomalous: Solving Dual Imbalance in Multi-Instance Learning

In real-world applications, it is highly challenging to detect anomalous samples with extremely sparse anomalies, as they are highly similar to and thus easily confused with normal samples. Moreover, the number of anomalous samples is…

Machine Learning · Statistics 2025-09-29 Lin-Han Jia, Lan-Zhe Guo, Zhi Zhou, Si-Ye Han, Zi-Wen Li, Yu-Feng Li

Training-Free Bayesianization for Low-Rank Adapters of Large Language Models

Estimating the uncertainty of responses from Large Language Models (LLMs) remains a critical challenge. While recent Bayesian methods have demonstrated effectiveness in quantifying uncertainty through low-rank weight updates, they typically…

Machine Learning · Statistics 2025-09-29 Haizhou Shi, Yibin Wang, Ligong Han, Huan Zhang, Hao Wang

Response to Promises and Pitfalls of Deep Kernel Learning

This note responds to "Promises and Pitfalls of Deep Kernel Learning" (Ober et al., 2021). The marginal likelihood of a Gaussian process can be compartmentalized into a data fit term and a complexity penalty. Ober et al. (2021) show that…

Machine Learning · Statistics 2025-09-26 Andrew Gordon Wilson, Zhiting Hu, Ruslan Salakhutdinov, Eric P. Xing

Breaking the curse of dimensionality for linear rules: optimal predictors over the ellipsoid

In this work, we address the following question: What minimal structural assumptions are needed to prevent the degradation of statistical learning bounds with increasing dimensionality? We investigate this question in the classical…

Machine Learning · Statistics 2025-09-26 Alexis Ayme, Bruno Loureiro

WISER: Segmenting watermarked region - an epidemic change-point perspective

With the increasing popularity of large language models, concerns over content authenticity have led to the development of myriad watermarking schemes. These schemes can be used to detect a machine-generated text via an appropriate key,…

Machine Learning · Statistics 2025-09-26 Soham Bonnerjee, Sayar Karmakar, Subhrajyoty Roy

A Hierarchical Variational Graph Fused Lasso for Recovering Relative Rates in Spatial Compositional Data

The analysis of spatial data from biological imaging technology, such as imaging mass spectrometry (IMS) or imaging mass cytometry (IMC), is challenging because of a competitive sampling process which convolves signals from molecules in a…

Machine Learning · Statistics 2025-09-26 Joaquim Valerio Teixeira, Ed Reznik, Sudipto Banerjee, Wesley Tansey