Machine Learning - Scifaro

Deflation-Free Optimal Scoring

Sparse Optimal Scoring (SOS) reformulates linear discriminant analysis to enable feature selection through elastic net regularization, making it well-suited for high-dimensional settings where the number of features exceeds observations.…

Machine Learning · Statistics 2026-04-29 Sharmin Afroz, Brendan Ames

Residual-loss Anomaly Analysis of Physics-Informed Neural Networks: An Inverse Method for Change-point Detection in Nonlinear Dynamical Systems with Regime Switching

Nonlinear dynamical systems with regime transitions are typically described by ordinary differential equations with jumping parameters. Traditional methods often treat change-point detection and parameter estimation as separate…

Machine Learning · Statistics 2026-04-29 Yuhe Bai, Chengli Tan, Jiaqi Li, Xiangjun Wang, Zhikun Zhang

Spectral bandits

Smooth functions on graphs have wide applications in manifold and semi-supervised learning. In this work, we study a bandit problem where the payoffs of arms are smooth on a graph. This framework is suitable for solving online learning…

Machine Learning · Statistics 2026-04-29 Tomáš Kocák, Rémi Munos, Branislav Kveton, Shipra Agrawal, Michal Valko

Online learning with Erdős-Rényi side-observation graphs

We consider adversarial multi-armed bandit problems where the learner is allowed to observe losses of a number of arms beside the arm that it actually chose. We study the case where all non-chosen arms reveal their loss with a fixed but…

Machine Learning · Statistics 2026-04-29 Tomáš Kocák, Gergely Neu, Michal Valko

Elite-Driven Support Vector Machines for Classification

Support vector machines (SVMs) are a standard tool for binary classification, but their classical formulations are purely data-driven and offer no direct way to encode trusted benchmark models or structured preferences on selected subsets…

Machine Learning · Statistics 2026-04-29 Mohammad Jafari Jozani, Bahram Moeinianfar

A Finite Time Analysis of Thompson Sampling for Bayesian Optimization with Preferential Feedback

Preference feedback, in the form of pairwise comparisons rather than scalar scores, has seen increasing use in applications such as human-, laboratory-, and expert-in-the-loop design, as well as scientific discovery. We propose a Thompson…

Machine Learning · Statistics 2026-04-29 Joseph Lazzaro, Davide Buffelli, Da-shan Shiu, Sattar Vakili

StrADiff: A Structured Source-Wise Adaptive Diffusion Framework for Linear and Nonlinear Blind Source Separation

This paper presents StrADiff, a Structured Source-Wise Adaptive Diffusion Framework for unsupervised blind source separation under linear and nonlinear mixing. The framework treats each latent dimension as a source branch and assigns to it…

Machine Learning · Statistics 2026-04-29 Yuan-Hao Wei

Minimax Generalized Cross-Entropy

Loss functions play a central role in supervised classification. Cross-entropy (CE) is widely used, whereas the mean absolute error (MAE) loss can offer robustness but is difficult to optimize. Interpolating between the CE and MAE losses,…

Machine Learning · Statistics 2026-04-29 Kartheek Bondugula, Santiago Mazuelas, Aritz Pérez, Anqi Liu

Provable Accelerated Bayesian Optimization with Knowledge Transfer

We study how to accelerate Bayesian optimization (BO) on a target task by transferring historical knowledge from related source tasks. Existing work on BO with knowledge transfer either lacks theoretical guarantees or achieves the same…

Machine Learning · Statistics 2026-04-29 Haitao Lin, Boxin Zhao, Mladen Kolar, Chong Liu

Extreme bandits

In many areas of medicine, security, and life sciences, we want to allocate limited resources to different sources in order to detect extreme values. In this paper, we study an efficient way to allocate these resources sequentially under…

Machine Learning · Statistics 2026-04-28 Alexandra Carpentier, Michal Valko

A Divergence-Based Method for Weighting and Averaging Model Predictions

This paper uses a minimum divergence framework to introduce a new way of calculating model weights that can be used to average probabilistic predictions from statistical and machine learning models. The method is general and can be applied…

Machine Learning · Statistics 2026-04-28 Olav Benjamin Vassend

Conditional Score-Based Modeling of Effective Langevin Dynamics

Stochastic reduced-order models are widely used to represent the effective dynamics of complex systems, but estimating their drift and diffusion coefficients from data remains challenging. Standard approaches often rely on short-time…

Machine Learning · Statistics 2026-04-28 Ludovico T. Giorgini

High-dimensional Semi-supervised Classification via the Fermat Distance

Semi-supervised classification, where unlabeled data are massive but labeled data are limited, often arises in machine learning applications. We address this challenge under high-dimensional data by leveraging the manifold and cluster…

Machine Learning · Statistics 2026-04-28 Ruoxu Tan, Yiming Zang

Inference of Online Newton Methods with Nesterov's Accelerated Sketching

Reliable decision-making with streaming data requires principled uncertainty quantification of online methods. While first-order methods enable efficient iterate updates, their inference procedures still require updating proper (covariance)…

Machine Learning · Statistics 2026-04-28 Haoxuan Wang, Xinchen Du, Sen Na

Learning Curves and Benign Overfitting of Spectral Algorithms in Large Dimensions

Existing large-dimensional theory for spectral algorithms resolves either the optimally tuned point or the interpolation limit, but leaves the under-regularized regime unexplored. We study the learning curve and benign overfitting of…

Machine Learning · Statistics 2026-04-28 Weihao Lu, Qian Lin, Yingcun Xia, Dongming Huang

MOCA: A Transformer-based Modular Causal Inference Framework with One-way Cross-attention and Cutting Feedback

Causal effect estimation from observational data requires careful adjustment for confounding. Classical estimators such as inverse probability weighting and augmented inverse probability weighting are effective under favorable model…

Machine Learning · Statistics 2026-04-28 Lei Wang, Debashis Ghosh

Turtle shell clustering: A mixture approach to discriminative clustering with applications to flow cytometry and other data

Generative approaches to clustering provide information on geometric properties of clusters, whereas discriminative approaches provide boundaries between clusters. Ideas from both approaches are incorporated to present a fully unsupervised,…

Machine Learning · Statistics 2026-04-28 Mackenzie R. Neal, Paul D. McNicholas, Arthur White

Rethinking Trust Region Bayesian Optimization in High Dimensions

Trust Region Bayesian Optimization (TuRBO) is an effective strategy for alleviating the curse of dimensionality in high-dimensional black-box optimization. However, inappropriate lengthscale design can cause the local Gaussian process (GP)…

Machine Learning · Statistics 2026-04-28 Wei-Ting Tang, Joel A. Paulson

On the Peril of (Even a Little) Nonstationarity in Satisficing Regret Minimization

Motivated by the principle of satisficing in decision-making, we study satisficing regret guarantees for nonstationary $K$-armed bandits. We show that in the general realizable, piecewise-stationary setting with $L$ stationary segments, the…

Machine Learning · Statistics 2026-04-28 Yixuan Zhang, Ruihao Zhu, Qiaomin Xie

Shuffle and Joint Differential Privacy for Generalized Linear Contextual Bandits

We present the first algorithms for generalized linear contextual bandits under shuffle differential privacy and joint differential privacy. While prior work on private contextual bandits has been restricted to linear reward models -- which…

Machine Learning · Statistics 2026-04-28 Sahasrajit Sarmasarkar