Machine Learning - Scifaro

Online Learning with Limited Information in the Sliding Window Model

Motivated by recent work on the experts problem in the streaming model, we consider the experts problem in the sliding window model. The sliding window model is a well-studied model that captures applications such as traffic monitoring,…

Machine Learning · Statistics 2026-01-08 Vladimir Braverman, Sumegha Garg, Chen Wang, David P. Woodruff, Samson Zhou

Microeconomic Foundations of Multi-Agent Learning

Modern AI systems increasingly operate inside markets and institutions where data, behavior, and incentives are endogenous. This paper develops an economic foundation for multi-agent learning by studying a principal-agent interaction in a…

Machine Learning · Statistics 2026-01-08 Nassim Helou

On the Identifiability of Regime-Switching Models with Multi-Lag Dependencies

Identifiability is central to the interpretability of deep latent variable models, ensuring parameterisations are uniquely determined by the data-generating distribution. However, it remains underexplored for deep regime-switching time…

Machine Learning · Statistics 2026-01-08 Carles Balsells-Rodas, Toshiko Matsui, Pedro A. M. Mediano, Yixin Wang, Yingzhen Li

An approach to Fisher-Rao metric for infinite dimensional non-parametric information geometry

Being infinite dimensional, non-parametric information geometry has long faced an "intractability barrier" due to the fact that the Fisher-Rao metric is now a functional incurring difficulties in defining its inverse. This paper introduces…

Machine Learning · Statistics 2026-01-08 Bing Cheng, Howell Tong

An Anytime Algorithm for Good Arm Identification

In good arm identification (GAI), the goal is to identify one arm whose average performance exceeds a given threshold, referred to as a good arm, if it exists. Few works have studied GAI in the fixed-budget setting when the sampling budget…

Machine Learning · Statistics 2026-01-08 Marc Jourdan, Andrée Delahaye-Duriez, Clémence Réda

Self-Supervised Learning from Noisy and Incomplete Data

Many important problems in science and engineering involve inferring a signal from noisy and/or incomplete observations, where the observation process is known. Historically, this problem has been tackled using hand-crafted regularization…

Machine Learning · Statistics 2026-01-07 Julián Tachella, Mike Davies

Fast Conformal Prediction using Conditional Interquantile Intervals

We introduce Conformal Interquantile Regression (CIR), a conformal regression method that efficiently constructs near-minimal prediction intervals with guaranteed coverage. CIR leverages black-box machine learning models to estimate outcome…

Machine Learning · Statistics 2026-01-07 Naixin Guo, Rui Luo, Zhixin Zhou

Mitigating Long-Tailed Anomaly Score Distributions with Importance-Weighted Loss

Anomaly detection is crucial in industrial applications for identifying rare and unseen patterns to ensure system reliability. Traditional models, trained on a single class of normal data, struggle with real-world distributions where normal…

Machine Learning · Statistics 2026-01-07 Jungi Lee, Jungkwon Kim, Chi Zhang, Sangmin Kim, Kwangsun Yoo, Seok-Joo Byun

Modeling Information Blackouts in Missing Not-At-Random Time Series Data

Large-scale traffic forecasting relies on fixed sensor networks that often exhibit blackouts: contiguous intervals of missing measurements caused by detector or communication failures. These outages are typically handled under a Missing At…

Machine Learning · Statistics 2026-01-07 Aman Sunesh, Allan Ma, Siddarth Nilol

Source-Optimal Training is Transfer-Suboptimal

We prove that training a source model optimally for its own task is generically suboptimal when the objective is downstream transfer. We study the source-side optimization problem in L2-SP ridge regression and show a fundamental mismatch…

Machine Learning · Statistics 2026-01-07 C. Evans Hedges

Error analysis of a compositional score-based algorithm for simulation-based inference

Simulation-based inference (SBI) has become a widely used framework in applied sciences for estimating the parameters of stochastic models that best explain experimental observations. A central question in this setting is how to effectively…

Machine Learning · Statistics 2026-01-07 Camille Touron, Gabriel V. Cardoso, Julyan Arbel, Pedro L. C. Rodrigues

SPARKLE: A Nonparametric Approach for Online Decision-Making with High-Dimensional Covariates

Personalized services are central to today's digital economy, and their sequential decisions are often modeled as contextual bandits. Modern applications pose two main challenges: high-dimensional covariates and the need for nonparametric…

Machine Learning · Statistics 2026-01-07 Wenjia Wang, Qingwen Zhang, Xiaowei Zhang

Learning mirror maps in policy mirror descent

Policy Mirror Descent (PMD) is a popular framework in reinforcement learning, serving as a unifying perspective that encompasses numerous algorithms. These algorithms are derived through the selection of a mirror map and enjoy finite-time…

Machine Learning · Statistics 2026-01-07 Carlo Alfano, Sebastian Towers, Silvia Sapora, Chris Lu, Patrick Rebeschini

Development of a high-resolution indoor radon map using a new machine learning-based probabilistic model and German radon survey data

Accurate knowledge of indoor radon concentration is crucial for assessing radon-related health effects or identifying radon-prone areas. Indoor radon concentration at the national scale is usually estimated on the basis of extensive…

Machine Learning · Statistics 2026-01-07 Eric Petermann, Peter Bossew, Joachim Kemski, Valeria Gruber, Nils Suhr, Bernd Hoffmann

At the Intersection of Deep Sequential Model Framework and State-space Model Framework: Study on Option Pricing

Inference and forecast problems of the nonlinear dynamical system have arisen in a variety of contexts. Reservoir computing and deep sequential models, on the one hand, have demonstrated efficient, robust, and superior performance in…

Machine Learning · Statistics 2026-01-07 Ziyang Ding, Sayan Mukherjee

A Multilayered Approach to Classifying Customer Responsiveness and Credit Risk

This study evaluates the performance of various classifiers in three distinct models: response, risk, and response-risk, concerning credit card mail campaigns and default prediction. In the response model, the Extra Trees classifier…

Machine Learning · Statistics 2026-01-06 Ayomide Afolabi, Ebere Ogburu, Symon Kimitei

Sparse Convex Biclustering

Biclustering is an essential unsupervised machine learning technique for simultaneously clustering rows and columns of a data matrix, with widespread applications in genomics, transcriptomics, and other high-dimensional omics data. Despite…

Machine Learning · Statistics 2026-01-06 Jiakun Jiang, Dewei Xiang, Chenliang Gu, Wei Liu, Binhuan Wang

Deep Linear Discriminant Analysis Revisited

We show that for unconstrained Deep Linear Discriminant Analysis (LDA) classifiers, maximum-likelihood training admits pathological solutions in which class means drift together, covariances collapse, and the learned representation becomes…

Machine Learning · Statistics 2026-01-06 Maxat Tezekbayev, Rustem Takhanov, Arman Bolatov, Zhenisbek Assylbekov

Fast Gibbs Sampling on Bayesian Hidden Markov Model with Missing Observations

The Hidden Markov Model (HMM) is a widely-used statistical model for handling sequential data. However, the presence of missing observations in real-world datasets often complicates the application of the model. The EM algorithm and Gibbs…

Machine Learning · Statistics 2026-01-06 Dongrong Li, Tianwei Yu, Xiaodan Fan

Evidence Slopes and Effective Dimension in Singular Linear Models

Bayesian model selection commonly relies on Laplace approximation or the Bayesian Information Criterion (BIC), which assume that the effective model dimension equals the number of parameters. Singular learning theory replaces this…

Machine Learning · Statistics 2026-01-06 Kalyaan Rao