Machine Learning - Scifaro

Higher-arity PAC learning, VC dimension and packing lemma

The aim of this note is to overview some of our work in Chernikov, Towsner'20 (arXiv:2010.00726) developing higher arity VC theory (VC$_n$ dimension), including a generalization of Haussler packing lemma, and an associated tame (slice-wise)…

Machine Learning · Statistics 2025-10-16 Artem Chernikov, Henry Towsner

The Bayesian Approach to Continual Learning: An Overview

Continual learning is an online paradigm where a learner continually accumulates knowledge from different tasks encountered over sequential time steps. Importantly, the learner is required to extend and update its knowledge without…

Machine Learning · Statistics 2025-10-16 Tameem Adel

Bayesian Double Descent

Double descent is a phenomenon of over-parameterized statistical models such as deep neural networks which have a re-descending property in their risk function. As the complexity of the model increases, risk exhibits a U-shaped region due…

Machine Learning · Statistics 2025-10-16 Nick Polson, Vadim Sokolov

Interventional Processes for Causal Uncertainty Quantification

Reliable uncertainty quantification for causal effects is crucial in various applications, but remains difficult in nonparametric models, particularly for continuous treatments. We introduce IMPspec, a Gaussian process (GP) framework for…

Machine Learning · Statistics 2025-10-16 Hugh Dance, Peter Orbanz, Arthur Gretton

Clustering with minimum spanning trees: How good can it be?

Minimum spanning trees (MSTs) provide a convenient representation of datasets in numerous pattern recognition activities. Moreover, they are relatively fast to compute. In this paper, we quantify the extent to which they are meaningful in…

Machine Learning · Statistics 2025-10-16 Marek Gagolewski, Anna Cena, Maciej Bartoszuk, Łukasz Brzozowski

Representation Theorem for Matrix Product States

In this work, we investigate the universal representation capacity of the Matrix Product States (MPS) from the perspective of boolean functions and continuous functions. We show that MPS can accurately realize arbitrary boolean functions by…

Machine Learning · Statistics 2025-10-16 Erdong Guo, David Draper

Dendrograms of Mixing Measures for Softmax-Gated Gaussian Mixture of Experts: Consistency without Model Sweeps

We develop a unified statistical framework for softmax-gated Gaussian mixture of experts (SGMoE) that addresses three long-standing obstacles in parameter estimation and model selection: (i) non-identifiability of gating parameters up to…

Machine Learning · Statistics 2025-10-15 Do Tien Hai, Trung Nguyen Mai, TrungTin Nguyen, Nhat Ho, Binh T. Nguyen, Christopher Drovandi

Universal Adaptive Environment Discovery

An open problem in Machine Learning is how to prevent models from exploiting spurious correlations in the data; a famous example is the background-label shortcut in the Waterbirds dataset. A common remedy is to train a model across multiple…

Machine Learning · Statistics 2025-10-15 Madi Matymov, Ba-Hien Tran, Maurizio Filippone

Improved Central Limit Theorem and Bootstrap Approximations for Linear Stochastic Approximation

In this paper, we refine the Berry-Esseen bounds for the multivariate normal approximation of Polyak-Ruppert averaged iterates arising from the linear stochastic approximation (LSA) algorithm with decreasing step size. We consider the…

Machine Learning · Statistics 2025-10-15 Bogdan Butyrin, Eric Moulines, Alexey Naumov, Sergey Samsonov, Qi-Man Shao, Zhuo-Song Zhang

Learning Latent Energy-Based Models via Interacting Particle Langevin Dynamics

We develop interacting particle algorithms for learning latent variable models with energy-based priors. To do so, we leverage recent developments in particle-based methods for solving maximum marginal likelihood estimation (MMLE) problems.…

Machine Learning · Statistics 2025-10-15 Joanna Marks, Tim Y. J. Wang, O. Deniz Akyildiz

Compressibility Measures Complexity: Minimum Description Length Meets Singular Learning Theory

We study neural network compressibility by using singular learning theory to extend the minimum description length (MDL) principle to singular models like neural networks. Through extensive experiments on the Pythia suite with quantization,…

Machine Learning · Statistics 2025-10-15 Einar Urdshals, Edmund Lau, Jesse Hoogland, Stan van Wingerden, Daniel Murfet

Statistical Guarantees for High-Dimensional Stochastic Gradient Descent

Stochastic Gradient Descent (SGD) and its Ruppert-Polyak averaged variant (ASGD) lie at the heart of modern large-scale learning, yet their theoretical properties in high-dimensional settings are rarely understood. In this paper, we provide…

Machine Learning · Statistics 2025-10-15 Jiaqi Li, Zhipeng Lou, Johannes Schmidt-Hieber, Wei Biao Wu

Simplifying Optimal Transport through Schatten-$p$ Regularization

We propose a new general framework for recovering low-rank structure in optimal transport using Schatten-$p$ norm regularization. Our approach extends existing methods that promote sparse and interpretable transport maps or plans, while…

Machine Learning · Statistics 2025-10-15 Tyler Maunu

High-Probability Bounds For Heterogeneous Local Differential Privacy

We study statistical estimation under local differential privacy (LDP) when users may hold heterogeneous privacy levels and accuracy must be guaranteed with high probability. Departing from the common in-expectation analyses, and for…

Machine Learning · Statistics 2025-10-15 Maryam Aliakbarpour, Alireza Fallah, Swaha Roy, Ria Stevens

Active Subspaces in Infinite Dimension

Active subspace analysis uses the leading eigenspace of the gradient's second moment to conduct supervised dimension reduction. In this article, we extend this methodology to real-valued functionals on Hilbert space. We define an operator…

Machine Learning · Statistics 2025-10-15 Poorbita Kundu, Nathan Wycoff

On Thompson Sampling and Bilateral Uncertainty in Additive Bayesian Optimization

In Bayesian Optimization (BO), additive assumptions can mitigate the twin difficulties of modeling and searching a complex function in high dimension. However, common acquisition functions, like the Additive Lower Confidence Bound, ignore…

Machine Learning · Statistics 2025-10-15 Nathan Wycoff

Split Conformal Classification with Unsupervised Calibration

Methods for split conformal prediction leverage calibration samples to transform any prediction rule into a set-prediction rule that complies with a target coverage probability. Existing methods provide remarkably strong performance…

Machine Learning · Statistics 2025-10-15 Santiago Mazuelas

SpaPool: Soft Partition Assignment Pooling for Graph Neural Networks

This paper introduces SpaPool, a novel pooling method that combines the strengths of both dense and sparse techniques for a graph neural network. SpaPool groups vertices into an adaptive number of clusters, leveraging the benefits of both…

Machine Learning · Statistics 2025-10-15 Rodrigue Govan, Romane Scherrer, Philippe Fournier-Viger, Nazha Selmaoui-Folcher

An Introduction to Sliced Optimal Transport

Sliced Optimal Transport (SOT) is a rapidly developing branch of optimal transport (OT) that exploits the tractability of one-dimensional OT problems. By combining tools from OT, integral geometry, and computational statistics, SOT enables…

Machine Learning · Statistics 2025-10-15 Khai Nguyen

A Statistical Theory of Contrastive Learning via Approximate Sufficient Statistics

Contrastive learning -- a modern approach to extract useful representations from unlabeled data by training models to distinguish similar samples from dissimilar ones -- has driven significant progress in foundation models. In this work, we…

Machine Learning · Statistics 2025-10-15 Licong Lin, Song Mei