Machine Learning - Scifaro

Determinant Estimation under Memory Constraints and Neural Scaling Laws

Calculating or accurately estimating log-determinants of large positive definite matrices is of fundamental importance in many machine learning tasks. While its cubic computational complexity can already be prohibitive, in modern…

Machine Learning · Statistics 2025-07-11 Siavash Ameli, Chris van der Heide, Liam Hodgkinson, Fred Roosta, Michael W. Mahoney

Off-Policy Evaluation Under Nonignorable Missing Data

Off-Policy Evaluation (OPE) aims to estimate the value of a target policy using offline data collected from potentially different policies. In real-world applications, however, logged data often suffers from missingness. While OPE has been…

Machine Learning · Statistics 2025-07-10 Han Wang, Yang Xu, Wenbin Lu, Rui Song

Distribution-free inference for LightGBM and GLM with Tweedie loss

Prediction uncertainty quantification has become a key research topic in scientific and business problems in recent years. In the insurance industry (\cite{parodi2023pricing}), assessing the range of possible claim costs for individual drivers improves…

Machine Learning · Statistics 2025-07-10 Alokesh Manna, Aditya Vikram Sett, Dipak K. Dey, Yuwen Gu, Elizabeth D. Schifano, Jichao He

Adaptive collaboration for online personalized distributed learning with heterogeneous clients

We study the problem of online personalized decentralized learning with $N$ statistically heterogeneous clients collaborating to accelerate local training. An important challenge in this setting is to select relevant collaborators to reduce…

Machine Learning · Statistics 2025-07-10 Constantin Philippenko, Batiste Le Bars, Kevin Scaman, Laurent Massoulié

Fast Gaussian Processes under Monotonicity Constraints

Gaussian processes (GPs) are widely used as surrogate models for complicated functions in scientific and engineering applications. In many cases, prior knowledge about the function to be approximated, such as monotonicity, is available and…

Machine Learning · Statistics 2025-07-10 Chao Zhang, Jasper M. Everink, Jakob Sauer Jørgensen

Semi-parametric Functional Classification via Path Signatures Logistic Regression

We propose Path Signatures Logistic Regression (PSLR), a semi-parametric framework for classifying vector-valued functional data with scalar covariates. Classical functional logistic regression models rely on linear assumptions and fixed…

Machine Learning · Statistics 2025-07-10 Pengcheng Zeng, Siyuan Jiang

On the Hardness of Unsupervised Domain Adaptation: Optimal Learners and Information-Theoretic Perspective

This paper studies the hardness of unsupervised domain adaptation (UDA) under covariate shift. We model the uncertainty that the learner faces by a distribution $\pi$ in the ground-truth triples $(p, q, f)$ -- which we call a UDA class --…

Machine Learning · Statistics 2025-07-10 Zhiyi Dong, Zixuan Liu, Yongyi Mao

Neural Networks for Tamed Milstein Approximation of SDEs with Additive Symmetric Jump Noise Driven by a Poisson Random Measure

This work aims to estimate the drift and diffusion functions in stochastic differential equations (SDEs) driven by a particular class of Lévy processes with finite jump intensity, using neural networks. We propose a framework that…

Machine Learning · Statistics 2025-07-10 Jose-Hermenegildo Ramirez-Gonzalez, Ying Sun

Bayesian Invariance Modeling of Multi-Environment Data

Invariant prediction [Peters et al., 2016] analyzes feature/outcome data from multiple environments to identify invariant features - those with a stable predictive relationship to the outcome. Such features support generalization to new…

Machine Learning · Statistics 2025-07-10 Luhuan Wu, Mingzhang Yin, Yixin Wang, John P. Cunningham, David M. Blei

Wild refitting for black box prediction

We describe and analyze a computationally efficient refitting procedure for computing high-probability upper bounds on the instance-wise mean-squared prediction error of penalized nonparametric estimates based on least-squares minimization.…

Machine Learning · Statistics 2025-07-10 Martin J. Wainwright

Very fast Bayesian Additive Regression Trees on GPU

Bayesian Additive Regression Trees (BART) is a nonparametric Bayesian regression technique based on an ensemble of decision trees. It is part of the toolbox of many statisticians. The overall statistical quality of the regression is…

Machine Learning · Statistics 2025-07-10 Giacomo Petrillo

Quadratic Gating Mixture of Experts: Statistical Insights into Self-Attention

Mixture of Experts (MoE) models are well known for effectively scaling model capacity while preserving computational overheads. In this paper, we establish a rigorous relation between MoE and the self-attention mechanism, showing that each…

Machine Learning · Statistics 2025-07-10 Pedram Akbarian, Huy Nguyen, Xing Han, Nhat Ho

Emergence in non-neural models: grokking modular arithmetic via average gradient outer product

Neural networks trained to solve modular arithmetic tasks exhibit grokking, a phenomenon where the test accuracy starts improving long after the model achieves 100% training accuracy in the training process. It is often taken as an example…

Machine Learning · Statistics 2025-07-10 Neil Mallinar, Daniel Beaglehole, Libin Zhu, Adityanarayanan Radhakrishnan, Parthe Pandit, Mikhail Belkin

Nonlinear denoising score matching for enhanced learning of structured distributions

We present a novel method for training score-based generative models which uses nonlinear noising dynamics to improve learning of structured distributions. Generalizing to a nonlinear drift allows for additional structure to be incorporated…

Machine Learning · Statistics 2025-07-10 Jeremiah Birrell, Markos A. Katsoulakis, Luc Rey-Bellet, Benjamin J. Zhang, Wei Zhu

Protecting Classifiers From Attacks

In multiple domains such as malware detection, automated driving systems, or fraud detection, classification algorithms are susceptible to being attacked by malicious agents willing to perturb the value of instance covariates to pursue…

Machine Learning · Statistics 2025-07-10 Victor Gallego, Roi Naveiro, Alberto Redondo, David Rios Insua, Fabrizio Ruggeri

Estimating prevalence with precision and accuracy

Unlike classification, whose goal is to estimate the class of each data point in a dataset, prevalence estimation or quantification is a task that aims to estimate the distribution of classes in a dataset. The two main tasks in prevalence…

Machine Learning · Statistics 2025-07-09 Aime Bienfait Igiraneza, Christophe Fraser, Robert Hinch

Kernel Trace Distance: Quantum Statistical Metric between Measures through RKHS Density Operators

Distances between probability distributions are a key component of many statistical machine learning tasks, from two-sample testing to generative modeling, among others. We introduce a novel distance between measures that compares them…

Machine Learning · Statistics 2025-07-09 Arturo Castellanos, Anna Korba, Pavlo Mozharovskyi, Hicham Janati

Online Regularized Learning Algorithms in RKHS with $\beta$- and $\phi$-Mixing Sequences

In this paper, we study an online regularized learning algorithm in a reproducing kernel Hilbert space (RKHS) based on a class of dependent processes. We choose such a process where the degree of dependence is measured by mixing…

Machine Learning · Statistics 2025-07-09 Priyanka Roy, Susanne Saminger-Platz

Best-of-N through the Smoothing Lens: KL Divergence and Regret Analysis

A simple yet effective method for inference-time alignment of generative models is Best-of-$N$ (BoN), where $N$ outcomes are sampled from a reference policy, evaluated using a proxy reward model, and the highest-scoring one is selected.…

Machine Learning · Statistics 2025-07-09 Gholamali Aminian, Idan Shenfeld, Amir R. Asadi, Ahmad Beirami, Youssef Mroueh
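
The Best-of-$N$ selection step described in the entry above can be sketched in a few lines. This is only an illustration of the generic procedure (sample $N$ outcomes, score them with a proxy reward, keep the best), not the paper's analysis or code; the names `best_of_n`, `sample`, and `proxy_reward` are hypothetical, and the Gaussian "policy" and distance-based reward are toy stand-ins.

```python
import random
from typing import Callable, List, TypeVar

T = TypeVar("T")


def best_of_n(sample: Callable[[], T],
              proxy_reward: Callable[[T], float],
              n: int) -> T:
    """Draw n outcomes from a reference policy and return the one
    with the highest proxy reward (hypothetical helper)."""
    if n < 1:
        raise ValueError("n must be at least 1")
    candidates: List[T] = [sample() for _ in range(n)]
    return max(candidates, key=proxy_reward)


# Toy usage: the "reference policy" is a standard normal sampler and
# the "proxy reward" prefers outcomes close to 1.0 (both are assumptions).
rng = random.Random(0)
chosen = best_of_n(lambda: rng.gauss(0.0, 1.0),
                   lambda x: -abs(x - 1.0),
                   n=16)
```

With `n=1` the function simply returns a single draw from the reference policy; larger `n` pushes the selected outcome toward higher proxy reward.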

Efficient Risk-sensitive Planning via Entropic Risk Measures

Risk-sensitive planning aims to identify policies maximizing some tail-focused metrics in Markov Decision Processes (MDPs). Such an optimization task can be very costly for the most widely used and interpretable metrics such as threshold…

Machine Learning · Statistics 2025-07-09 Alexandre Marthe, Samuel Bounan, Aurélien Garivier, Claire Vernade