Machine Learning - Scifaro

Causal Ordering Without Effect Estimation: A Framework for Using Proxies in Treatment Prioritization

Who should we prioritize for treatment when causal effects cannot be estimated? In practice, organizations often rely on predictive proxies: ads are targeted using purchase probabilities, and retention incentives are allocated using…

Machine Learning · Statistics 2025-10-15 Carlos Fernández-Loría, Jorge Loría

torchsom: The Reference PyTorch Library for Self-Organizing Maps

This paper introduces torchsom, an open-source Python library that provides a reference implementation of the Self-Organizing Map (SOM) in PyTorch. This package offers three main features: (i) dimensionality reduction, (ii) clustering, and…

Machine Learning · Statistics 2025-10-14 Louis Berthier, Ahmed Shokry, Maxime Moreaud, Guillaume Ramelet, Eric Moulines

How Patterns Dictate Learnability in Sequential Data

Sequential data, ranging from financial time series to natural language, has driven the growing adoption of autoregressive models. However, these algorithms rely on the presence of underlying patterns in the data, and their identification…

Machine Learning · Statistics 2025-10-14 Mario Morawski, Anais Despres, Rémi Rehm

Missing Data Multiple Imputation for Tabular Q-Learning in Online RL

Missing data in online reinforcement learning (RL) poses challenges compared to missing data in standard tabular data or in offline policy learning. The need to impute and act at each time step means that imputation cannot be put off until…

Machine Learning · Statistics 2025-10-14 Kyla Chasalow, Skyler Wu, Susan Murphy

High-Dimensional Learning Dynamics of Quantized Models with Straight-Through Estimator

Quantized neural network training optimizes a discrete, non-differentiable objective. The straight-through estimator (STE) enables backpropagation through surrogate gradients and is widely used. While previous studies have primarily focused…

Machine Learning · Statistics 2025-10-14 Yuma Ichikawa, Shuhei Kashiwamura, Ayaka Sakata

From Data to Rewards: a Bilevel Optimization Perspective on Maximum Likelihood Estimation

Generative models form the backbone of modern machine learning, underpinning state-of-the-art systems in text, vision, and multimodal applications. While Maximum Likelihood Estimation has traditionally served as the dominant training…

Machine Learning · Statistics 2025-10-14 Abdelhakim Benechehab, Gabriel Singer, Corentin Léger, Youssef Attia El Hili, Giuseppe Paolo, Albert Thomas, Maurizio Filippone, Balázs Kégl

On Experiments

The scientific process is a means to turn the results of experiments into knowledge about the world in which we live. Much research effort has been directed toward automating this process. To do this, one needs to formulate the scientific…

Machine Learning · Statistics 2025-10-14 Brendan van Rooyen

One-Stage Top-$k$ Learning-to-Defer: Score-Based Surrogates with Theoretical Guarantees

We introduce the first one-stage Top-$k$ Learning-to-Defer framework, which unifies prediction and deferral by learning a shared score-based model that selects the $k$ most cost-effective entities (labels or experts) per input. While existing…

Machine Learning · Statistics 2025-10-14 Yannis Montreuil, Axel Carlier, Lai Xing Ng, Wei Tsang Ooi

Any-stepsize Gradient Descent for Separable Data under Fenchel-Young Losses

Gradient descent (GD) has been one of the most common optimizers in machine learning. In particular, the loss landscape of a neural network is typically sharpened during the initial phase of training, making the training dynamics hover…

Machine Learning · Statistics 2025-10-14 Han Bao, Shinsaku Sakaue, Yuki Takezawa

A Unified Information-Theoretic Framework for Meta-Learning Generalization

In recent years, information-theoretic generalization bounds have gained increasing attention for analyzing the generalization capabilities of meta-learning algorithms. However, existing results are confined to two-step bounds, failing to…

Machine Learning · Statistics 2025-10-14 Wen Wen, Tieliang Gong, Yuxin Dong, Zeyu Gao, Yong-Jin Liu

An Asymptotically Optimal Coordinate Descent Algorithm for Learning Bayesian Networks from Gaussian Models

This paper studies the problem of learning Bayesian networks from continuous observational data, generated according to a linear Gaussian structural equation model. We consider an $\ell_0$-penalized maximum likelihood estimator for this…

Machine Learning · Statistics 2025-10-14 Tong Xu, Simge Küçükyavuz, Ali Shojaie, Armeen Taeb

Deep conditional distribution learning via conditional Föllmer flow

We introduce an ordinary differential equation (ODE) based deep generative method for learning conditional distributions, named Conditional Föllmer Flow. Starting from a standard Gaussian distribution, the proposed flow could approximate…

Machine Learning · Statistics 2025-10-14 Jinyuan Chang, Zhao Ding, Yuling Jiao, Ruoxuan Li, Jerry Zhijian Yang

Sparse Robust Classification via the Kernel Mean

Many leading classification algorithms output a classifier that is a weighted average of kernel evaluations. Optimizing these weights is a nontrivial problem that still attracts much research effort. Furthermore, explaining these methods to…

Machine Learning · Statistics 2025-10-14 Brendan van Rooyen, Aditya Krishna Menon, Robert C. Williamson

Interpretable Generative and Discriminative Learning for Multimodal and Incomplete Clinical Data

Real-world clinical problems are often characterized by multimodal data, usually associated with incomplete views and limited sample sizes in their cohorts, posing significant limitations for machine learning algorithms. In this work, we…

Machine Learning · Statistics 2025-10-13 Albert Belenguer-Llorens, Carlos Sevilla-Salcedo, Janaina Mourao-Miranda, Vanessa Gómez-Verdejo

A unified Bayesian framework for adversarial robustness

The vulnerability of machine learning models to adversarial attacks remains a critical security challenge. Traditional defenses, such as adversarial training, typically robustify models by minimizing a worst-case loss. However, these…

Machine Learning · Statistics 2025-10-13 Pablo G. Arce, Roi Naveiro, David Ríos Insua

Distributionally robust approximation property of neural networks

The universal approximation property uniformly with respect to weakly compact families of measures is established for several classes of neural networks. To that end, we prove that these neural networks are dense in Orlicz spaces, thereby…

Machine Learning · Statistics 2025-10-13 Mihriban Ceylan, David J. Prömel

Mirror Flow Matching with Heavy-Tailed Priors for Generative Modeling on Convex Domains

We study generative modeling on convex domains using flow matching and mirror maps, and identify two fundamental challenges. First, standard log-barrier mirror maps induce heavy-tailed dual distributions, leading to ill-posed dynamics.…

Machine Learning · Statistics 2025-10-13 Yunrui Guan, Krishnakumar Balasubramanian, Shiqian Ma

Gradient-Guided Furthest Point Sampling for Robust Training Set Selection

Smart training set selection procedures enable the reduction of data needs and improve predictive robustness in machine learning problems relevant to chemistry. We introduce Gradient Guided Furthest Point Sampling (GGFPS), a simple…

Machine Learning · Statistics 2025-10-13 Morris Trestman, Stefan Gugler, Felix A. Faber, O. A. von Lilienfeld

Gradient-based Sample Selection for Faster Bayesian Optimization

Bayesian optimization (BO) is an effective technique for black-box optimization. However, its applicability is typically limited to moderate-budget problems due to the cubic complexity of fitting the Gaussian process (GP) surrogate model.…

Machine Learning · Statistics 2025-10-13 Qiyu Wei, Haowei Wang, Zirui Cao, Songhao Wang, Richard Allmendinger, Mauricio A Álvarez

Multiparameter regularization and aggregation in the context of polynomial functional regression

Most of the recent results in polynomial functional regression have been focused on an in-depth exploration of single-parameter regularization schemes. In contrast, in this study we go beyond that framework by introducing an algorithm for…

Machine Learning · Statistics 2025-10-13 Elke R. Gizewski, Markus Holzleitner, Lukas Mayer-Suess, Sergiy Pereverzyev, Sergei V. Pereverzyev