Machine Learning - Scifaro

On Instability of Minimax Optimal Optimism-Based Bandit Algorithms

Statistical inference from data generated by multi-armed bandit (MAB) algorithms is challenging due to their adaptive, non-i.i.d. nature. A classical manifestation is that sample averages of arm rewards under bandit sampling may fail to…

Machine Learning · Statistics 2025-11-25 Samya Praharaj, Koulik Khamaru

Fast Escape, Slow Convergence: Learning Dynamics of Phase Retrieval under Power-Law Data

Scaling laws describe how learning performance improves with data, compute, or training time, and have become a central theme in modern deep learning. We study this phenomenon in a canonical nonlinear model: phase retrieval with anisotropic…

Machine Learning · Statistics 2025-11-25 Guillaume Braun, Bruno Loureiro, Ha Quang Minh, Masaaki Imaizumi

Transforming Conditional Density Estimation Into a Single Nonparametric Regression Task

We propose a way of transforming the problem of conditional density estimation into a single nonparametric regression task via the introduction of auxiliary samples. This allows leveraging regression methods that work well in high…

Machine Learning · Statistics 2025-11-25 Alexander G. Reisach, Olivier Collier, Alex Luedtke, Antoine Chambaz

Reliable Selection of Heterogeneous Treatment Effect Estimators

We study the problem of selecting the best heterogeneous treatment effect (HTE) estimator from a collection of candidates in settings where the treatment effect is fundamentally unobserved. We cast estimator selection as a multiple testing…

Machine Learning · Statistics 2025-11-25 Jiayi Guo, Zijun Gao

Improving Forecasts of Suicide Attempts for Patients with Little Data

Ecological Momentary Assessment provides real-time data on suicidal thoughts and behaviors, but predicting suicide attempts remains challenging due to their rarity and patient heterogeneity. We show that single models fit to all patients…

Machine Learning · Statistics 2025-11-25 Genesis Hang, Annie Chen, Hope Neveux, Matthew K. Nock, Yaniv Yacoby

Sparse Polyak with optimal thresholding operators for high-dimensional M-estimation

We propose and analyze a variant of Sparse Polyak for high dimensional M-estimation problems. Sparse Polyak proposes a novel adaptive step-size rule tailored to suitably estimate the problem's curvature in the high-dimensional setting,…

Machine Learning · Statistics 2025-11-25 Tianqi Qiao, Marie Maros

Variational Estimators for Node Popularity Models

Node popularity is recognized as a key factor in modeling real-world networks, capturing heterogeneity in connectivity across communities. This concept is equally important in bipartite networks, where nodes in different partitions may…

Machine Learning · Statistics 2025-11-25 Jony Karki, Dongzhou Huang, Yunpeng Zhao

Prequential posteriors

Data assimilation is a fundamental task in updating forecasting models upon observing new data, with applications ranging from weather prediction to online reinforcement learning. Deep generative forecasting models (DGFMs) have shown…

Machine Learning · Statistics 2025-11-25 Shreya Sinha-Roy, Richard G. Everitt, Christian P. Robert, Ritabrata Dutta

Quantum Fourier Transform Based Kernel for Solar Irradiance Forecasting

This study proposes a Quantum Fourier Transform (QFT)-enhanced quantum kernel for short-term time-series forecasting. Each signal is windowed, amplitude-encoded, transformed by a QFT, then passed through a protective rotation layer to avoid…

Machine Learning · Statistics 2025-11-25 Nawfel Mechiche-Alami, Eduardo Rodriguez, Jose M. Cardemil, Enrique Lopez Droguett

Perturbing the Derivative: Wild Refitting for Model-Free Evaluation of Machine Learning Models under Bregman Losses

We study the excess risk evaluation of classical penalized empirical risk minimization (ERM) with Bregman losses. We show that by leveraging the idea of wild refitting, one can efficiently upper bound the excess risk through the so-called…

Machine Learning · Statistics 2025-11-25 Haichen Hu, David Simchi-Levi

Supervised Dynamic Dimension Reduction with Deep Neural Network

This paper studies the problem of dimension reduction, tailored to improving time series forecasting with high-dimensional predictors. We propose a novel Supervised Deep Dynamic Principal component analysis (SDDP) framework that…

Machine Learning · Statistics 2025-11-25 Zhanye Luo, Yuefeng Han, Xiufan Yu

Coupled Entropy: A Goldilocks Generalization for Complex Systems

The coupled entropy is proven to correct a flaw in the derivation of the Tsallis entropy and thereby solidify the theoretical foundations for analyzing the uncertainty of complex systems. The Tsallis entropy originated from considering…

Machine Learning · Statistics 2025-11-25 Kenric P. Nelson

On Minimax Estimation of Parameters in Softmax-Contaminated Mixture of Experts

The softmax-contaminated mixture of experts (MoE) model is deployed when a large-scale pre-trained model, which plays the role of a fixed expert, is fine-tuned for learning downstream tasks by including a new contamination part, or prompt,…

Machine Learning · Statistics 2025-11-25 Fanqi Yan, Huy Nguyen, Dung Le, Pedram Akbarian, Nhat Ho, Alessandro Rinaldo

(De)-regularized Maximum Mean Discrepancy Gradient Flow

We introduce a (de)-regularization of the Maximum Mean Discrepancy (DrMMD) and its Wasserstein gradient flow. Existing gradient flows that transport samples from source distribution to target distribution with only target samples, either…

Machine Learning · Statistics 2025-11-25 Zonghao Chen, Aratrika Mustafi, Pierre Glaser, Anna Korba, Arthur Gretton, Bharath K. Sriperumbudur

A Geometric Unification of Distributionally Robust Covariance Estimators: Shrinking the Spectrum by Inflating the Ambiguity Set

The state-of-the-art methods for estimating high-dimensional covariance matrices all shrink the eigenvalues of the sample covariance matrix towards a data-insensitive shrinkage target. The underlying shrinkage transformation is either…

Machine Learning · Statistics 2025-11-25 Man-Chung Yue, Yves Rychener, Daniel Kuhn, Viet Anh Nguyen

Bivariate DeepKriging for Large-scale Spatial Interpolation of Wind Fields

High spatial resolution wind data are essential for a wide range of applications in climate, oceanographic and meteorological studies. Large-scale spatial interpolation or downscaling of bivariate wind fields having velocity in two…

Machine Learning · Statistics 2025-11-25 Pratik Nag, Ying Sun, Brian J Reich

Convergence and concentration properties of constant step-size SGD through Markov chains

We consider the optimization of a smooth and strongly convex objective using constant step-size stochastic gradient descent (SGD) and study its properties through the prism of Markov chains. We show that, for unbiased gradient estimates…

Machine Learning · Statistics 2025-11-25 Ibrahim Merad, Stéphane Gaïffas

Towards Healing the Blindness of Score Matching

Score-based divergences have been widely used in machine learning and statistics applications. Despite their empirical success, a blindness problem has been observed when using these for multi-modal distributions. In this work, we discuss…

Machine Learning · Statistics 2025-11-25 Mingtian Zhang, Oscar Key, Peter Hayes, David Barber, Brooks Paige, François-Xavier Briol

Estimating Bidirectional Causal Effects with Large Scale Online Kernel Learning

In this study, a scalable online kernel learning framework is proposed for estimating bidirectional causal effects in systems characterized by mutual dependence and heteroskedasticity. Traditional causal inference often focuses on…

Machine Learning · Statistics 2025-11-24 Masahiro Tanaka

Optimal Convergence Rates of Deep Neural Network Classifiers

In this paper, we study the binary classification problem on $[0,1]^d$ under the Tsybakov noise condition (with exponent $s \in [0,\infty]$) and the compositional assumption. This assumption requires the conditional class probability…

Machine Learning · Statistics 2025-11-24 Zihan Zhang, Lei Shi, Ding-Xuan Zhou