Machine Learning - Scifaro

Guaranteed Noisy CP Tensor Recovery via Riemannian Optimization on the Segre Manifold

Recovering a low-CP-rank tensor from noisy linear measurements is a central challenge in high-dimensional data analysis, with applications spanning tensor PCA, tensor regression, and beyond. We exploit the intrinsic geometry of rank-one…

Machine Learning · Statistics 2025-10-02 Ke Xu, Yuefeng Han

CINDES: Classification induced neural density estimator and simulator

Neural network-based methods for (un)conditional density estimation have recently gained substantial attention, as various neural density estimators have outperformed classical approaches in real-data experiments. Despite these empirical…

Machine Learning · Statistics 2025-10-02 Dehao Dai, Jianqing Fan, Yihong Gu, Debarghya Mukherjee

Private Learning of Littlestone Classes, Revisited

We consider online and PAC learning of Littlestone classes subject to the constraint of approximate differential privacy. Our main result is a private learner to online-learn a Littlestone class with a mistake bound of…

Machine Learning · Statistics 2025-10-02 Xin Lyu

Identifying All ε-Best Arms in (Misspecified) Linear Bandits

Motivated by the need to efficiently identify multiple candidates in high trial-and-error cost tasks such as drug discovery, we propose a near-optimal algorithm to identify all ε-best arms (i.e., those at most ε worse than…

Machine Learning · Statistics 2025-10-02 Zhekai Li, Tianyi Ma, Cheng Hua, Ruihao Zhu

Propagating Model Uncertainty through Filtering-based Probabilistic Numerical ODE Solvers

Filtering-based probabilistic numerical solvers for ordinary differential equations (ODEs), also known as ODE filters, have been established as efficient methods for quantifying numerical uncertainty in the solution of ODEs. In practical…

Machine Learning · Statistics 2025-10-02 Dingling Yao, Filip Tronarp, Nathanael Bosch

On the Use of Relative Validity Indices for Comparing Clustering Approaches

Relative Validity Indices (RVIs) such as the Silhouette Width Criterion and Davies Bouldin indices are the most widely used tools for evaluating and optimising clustering outcomes. Traditionally, their ability to rank collections of…

Machine Learning · Statistics 2025-10-02 Luke W. Yerbury, Ricardo J. G. B. Campello, G. C. Livingston, Mark Goldsworthy, Lachlan O'Neil

Pretrain-Test Task Alignment Governs Generalization in In-Context Learning

In-context learning (ICL) is a central capability of Transformer models, but the structures in data that enable its emergence and govern its robustness remain poorly understood. In this work, we study how the structure of pretraining tasks…

Machine Learning · Statistics 2025-10-01 Mary I. Letey, Jacob A. Zavatone-Veth, Yue M. Lu, Cengiz Pehlevan

Spectral gap of Metropolis-within-Gibbs under log-concavity

The Metropolis-within-Gibbs (MwG) algorithm is a widely used Markov Chain Monte Carlo method for sampling from high-dimensional distributions when exact conditional sampling is intractable. We study MwG with Random Walk Metropolis (RWM)…

Machine Learning · Statistics 2025-10-01 Cecilia Secchi, Giacomo Zanella

Non-Vacuous Generalization Bounds: Can Rescaling Invariances Help?

A central challenge in understanding generalization is to obtain non-vacuous guarantees that go beyond worst-case complexity over data or weight space. Among existing approaches, PAC-Bayes bounds stand out as they can provide tight,…

Machine Learning · Statistics 2025-10-01 Damien Rouchouse, Antoine Gonon, Rémi Gribonval, Benjamin Guedj

Conservative Decisions with Risk Scores

In binary classification applications, conservative decision-making that allows for abstention can be advantageous. To this end, we introduce a novel approach that determines the optimal cutoff interval for risk scores, which can be…

Machine Learning · Statistics 2025-10-01 Yishu Wei, Wen-Yee Lee, George Ekow Quaye, Xiaogang Su

Fair Classification by Direct Intervention on Operating Characteristics

We develop new classifiers under group fairness in the attribute-aware setting for binary classification with multiple group fairness constraints (e.g., demographic parity (DP), equalized odds (EO), and predictive parity (PP)). We propose a…

Machine Learning · Statistics 2025-10-01 Kevin Jiang, Edgar Dobriban

Neural Optimal Transport Meets Multivariate Conformal Prediction

We propose a framework for conditional vector quantile regression (CVQR) that combines neural optimal transport with amortized optimization, and apply it to multivariate conformal prediction. Classical quantile regression does not extend…

Machine Learning · Statistics 2025-10-01 Vladimir Kondratyev, Alexander Fishkov, Nikita Kotelevskii, Mahmoud Hegazy, Remi Flamary, Maxim Panov, Eric Moulines

Regret Analysis of Posterior Sampling-Based Expected Improvement for Bayesian Optimization

Bayesian optimization is a powerful tool for optimizing an expensive-to-evaluate black-box function. In particular, the effectiveness of expected improvement (EI) has been demonstrated in a wide range of applications. However, theoretical…

Machine Learning · Statistics 2025-10-01 Shion Takeno, Yu Inatsu, Masayuki Karasuyama, Ichiro Takeuchi

TADA: Improved Diffusion Sampling with Training-free Augmented Dynamics

Diffusion models have demonstrated exceptional capabilities in generating high-fidelity images but typically suffer from inefficient sampling. Many solver designs and noise scheduling strategies have been proposed to dramatically improve…

Machine Learning · Statistics 2025-10-01 Tianrong Chen, Huangjie Zheng, David Berthelot, Jiatao Gu, Josh Susskind, Shuangfei Zhai

Fast Likelihood-Free Parameter Estimation for Lévy Processes

Lévy processes are widely used in financial modeling due to their ability to capture discontinuities and heavy tails, which are common in high-frequency asset return data. However, parameter estimation remains a challenge when associated…

Machine Learning · Statistics 2025-10-01 Nicolas Coloma, William Kleiber

Rethinking Diffusion Model in High Dimension

Curse of Dimensionality is an unavoidable challenge in statistical probability models, yet diffusion models seem to overcome this limitation, achieving impressive results in high-dimensional data generation. Diffusion models assume that…

Machine Learning · Statistics 2025-10-01 Zhenxin Zheng, Zhenjie Zheng

A Review on Riemannian Metric Learning: Closer to You than You Imagine

Riemannian metric learning is an emerging field in machine learning, unlocking new ways to encode complex data structures beyond traditional distance metric learning. While classical approaches rely on global distances in Euclidean space,…

Machine Learning · Statistics 2025-10-01 Samuel Gruffaz, Josua Sassen

Asymptotic Classification Error for Heavy-Tailed Renewal Processes

Despite the widespread occurrence of classification problems and the increasing collection of point process data across many disciplines, study of error probability for point process classification only emerged very recently. Here, we…

Machine Learning · Statistics 2025-10-01 Xinhui Rong, Victor Solo

On Spectral Learning for Odeco Tensors: Perturbation, Initialization, and Algorithms

We study spectral learning for orthogonally decomposable (odeco) tensors, emphasizing the interplay between statistical limits, optimization geometry, and initialization. Unlike matrices, recovery for odeco tensors does not hinge on…

Machine Learning · Statistics 2025-09-30 Arnab Auddy, Ming Yuan

Symmetry-Aware Bayesian Optimization via Max Kernels

Bayesian Optimization (BO) is a powerful framework for optimizing noisy, expensive-to-evaluate black-box functions. When the objective exhibits invariances under a group action, exploiting these symmetries can substantially improve BO…

Machine Learning · Statistics 2025-09-30 Anthony Bardou, Antoine Gonon, Aryan Ahadinia, Patrick Thiran