Machine Learning - Scifaro

Model Selection for Gaussian-gated Gaussian Mixture of Experts Using Dendrograms of Mixing Measures

Mixture of Experts (MoE) models constitute a widely utilized class of ensemble learning approaches in statistics and machine learning, known for their flexibility and computational efficiency. They have become integral components in…

Machine Learning · Statistics 2025-05-26 Tuan Thai, TrungTin Nguyen, Dat Do, Nhat Ho, Christopher Drovandi

iLOCO: Distribution-Free Inference for Feature Interactions

Feature importance measures are widely studied and are essential for understanding model behavior, guiding feature selection, and enhancing interpretability. However, many machine learning fitted models involve complex interactions between…

Machine Learning · Statistics 2025-05-26 Camille Little, Lili Zheng, Genevera Allen

E$^2$M: Double Bounded $\alpha$-Divergence Optimization for Tensor-based Discrete Density Estimation

Tensor-based discrete density estimation requires flexible modeling and proper divergence criteria to enable effective learning; however, traditional approaches using $\alpha$-divergence face analytical challenges due to the $\alpha$-power…

Machine Learning · Statistics 2025-05-26 Kazu Ghalamkari, Jesper Løve Hinrich, Morten Mørup

SLIDE: a surrogate fairness constraint to ensure fairness consistency

Because they have a vital effect on social decision making, AI algorithms should be not only accurate but also fair. Among various algorithms for fair AI, learning a prediction model by minimizing the empirical risk (e.g.,…

Machine Learning · Statistics 2025-05-26 Kunwoong Kim, Ilsang Ohn, Sara Kim, Yongdai Kim

How high is 'high'? Rethinking the roles of dimensionality in topological data analysis and manifold learning

We present a generalised Hanson-Wright inequality and use it to establish new statistical insights into the geometry of data point-clouds. In the setting of a general random function model of data, we clarify the roles played by three…

Machine Learning · Statistics 2025-05-23 Hannah Sansford, Nick Whiteley, Patrick Rubin-Delanchy

Generator-Mediated Bandits: Thompson Sampling for GenAI-Powered Adaptive Interventions

Recent advances in generative artificial intelligence (GenAI) models have enabled the generation of personalized content that adapts to up-to-date user context. While personalized decision systems are often modeled using bandit…

Machine Learning · Statistics 2025-05-23 Marc Brooks, Gabriel Durham, Kihyuk Hong, Ambuj Tewari

Higher-Order Asymptotics of Test-Time Adaptation for Batch Normalization Statistics

This study develops a higher-order asymptotic framework for test-time adaptation (TTA) of Batch Normalization (BN) statistics under distribution shift by integrating classical Edgeworth expansion and saddlepoint approximation techniques…

Machine Learning · Statistics 2025-05-23 Masanari Kimura

Graph-Smoothed Bayesian Black-Box Shift Estimator and Its Information Geometry

Label shift adaptation aims to recover target class priors when the labelled source distribution $P$ and the unlabelled target distribution $Q$ share $P(X \mid Y) = Q(X \mid Y)$ but $P(Y) \neq Q(Y)$. Classical black-box shift estimators…

Machine Learning · Statistics 2025-05-23 Masanari Kimura

Generalized Power Priors for Improved Bayesian Inference with Historical Data

The power prior is a class of informative priors designed to incorporate historical data alongside current data in a Bayesian framework. It includes a power parameter that controls the influence of historical data, providing flexibility and…

Machine Learning · Statistics 2025-05-23 Masanari Kimura, Howard Bondell

Exponential Convergence of CAVI for Bayesian PCA

Probabilistic principal component analysis (PCA) and its Bayesian variant (BPCA) are widely used for dimension reduction in machine learning and statistics. The main advantage of probabilistic PCA over the traditional formulation is…

Machine Learning · Statistics 2025-05-23 Arghya Datta, Philippe Gagnon, Florian Maire

Dimension-adapted Momentum Outscales SGD

We investigate scaling laws for stochastic momentum algorithms with small batch on the power law random features model, parameterized by data complexity, target complexity, and model size. When trained with a stochastic momentum algorithm,…

Machine Learning · Statistics 2025-05-23 Damien Ferbach, Katie Everett, Gauthier Gidel, Elliot Paquette, Courtney Paquette

Oh SnapMMD! Forecasting Stochastic Dynamics Beyond the Schrödinger Bridge's End

Scientists often want to make predictions beyond the observed time horizon of "snapshot" data following latent stochastic dynamics. For example, in time course single-cell mRNA profiling, scientists have access to cellular transcriptional…

Machine Learning · Statistics 2025-05-23 Renato Berlinghieri, Yunyi Shen, Jialong Jiang, Tamara Broderick

CoT Information: Improved Sample Complexity under Chain-of-Thought Supervision

Learning complex functions that involve multi-step reasoning poses a significant challenge for standard supervised learning from input-output examples. Chain-of-thought (CoT) supervision, which provides intermediate reasoning steps together…

Machine Learning · Statistics 2025-05-23 Awni Altabaa, Omar Montasser, John Lafferty

A Probabilistic Perspective on Model Collapse

In recent years, model collapse has become a critical issue in language model training, making it essential to understand the underlying mechanisms driving this phenomenon. In this paper, we investigate recursive parametric model training…

Machine Learning · Statistics 2025-05-23 Shirong Xu, Hengzhi He, Guang Cheng

Constrained Online Decision-Making: A Unified Framework

Contextual online decision-making problems with constraints appear in a wide range of real-world applications, such as adaptive experimental design under safety constraints, personalized recommendation with resource limits, and dynamic…

Machine Learning · Statistics 2025-05-23 Haichen Hu, David Simchi-Levi, Navid Azizan

Function-Space Learning Rates

We consider layerwise function-space learning rates, which measure the magnitude of the change in a neural network's output function in response to an update to a parameter tensor. This contrasts with traditional learning rates, which…

Machine Learning · Statistics 2025-05-23 Edward Milsom, Ben Anson, Laurence Aitchison

Scalable Implicit Graphon Learning

Graphons are continuous models that represent the structure of graphs and allow the generation of graphs of varying sizes. We propose Scalable Implicit Graphon Learning (SIGL), a scalable method that combines implicit neural representations…

Machine Learning · Statistics 2025-05-23 Ali Azizpour, Nicolas Zilberstein, Santiago Segarra

The Benefit of Being Bayesian in Online Conformal Prediction

Based on the framework of Conformal Prediction (CP), we study the online construction of confidence sets given a black-box machine learning model. By converting the target confidence levels into quantile levels, the problem can be reduced…

Machine Learning · Statistics 2025-05-23 Zhiyu Zhang, Zhou Lu, Heng Yang

Information-Theoretic Foundations for Machine Learning

The progress of machine learning over the past decade is undeniable. In retrospect, it is both remarkable and unsettling that this progress was achievable with little to no rigorous theory to guide experimentation. Despite this fact,…

Machine Learning · Statistics 2025-05-23 Hong Jun Jeon, Benjamin Van Roy

Bayesian Bandit Algorithms with Approximate Inference in Stochastic Linear Bandits

Bayesian bandit algorithms with approximate Bayesian inference have been widely used in real-world applications. Despite the superior practical performance, their theoretical justification is less investigated in the literature, especially…

Machine Learning · Statistics 2025-05-23 Ziyi Huang, Henry Lam, Haofeng Zhang