Machine Learning - Scifaro

Conic Formulations of Transport Metrics for Unbalanced Measure Networks and Hypernetworks

The Gromov-Wasserstein (GW) variant of optimal transport, designed to compare probability densities defined over distinct metric spaces, has emerged as an important tool for the analysis of data with complex structure, such as ensembles of…

Machine Learning · Statistics 2025-08-15 Mary Chriselda Antony Oliver, Emmanuel Hartman, Tom Needham

An Iterative Algorithm for Differentially Private $k$-PCA with Adaptive Noise

Given $n$ i.i.d. random matrices $A_i \in \mathbb{R}^{d \times d}$ that share a common expectation $\Sigma$, the objective of Differentially Private Stochastic PCA is to identify a subspace of dimension $k$ that captures the largest…

Machine Learning · Statistics 2025-08-15 Johanna Düngler, Amartya Sanyal

Dimension-Free Bounds for Generalized First-Order Methods via Gaussian Coupling

We establish non-asymptotic bounds on the finite-sample behavior of generalized first-order iterative algorithms -- including gradient-based optimization methods and approximate message passing (AMP) -- with Gaussian data matrices and…

Machine Learning · Statistics 2025-08-15 Galen Reeves

MIRRAMS: Learning Robust Tabular Models under Unseen Missingness Shifts

The presence of missing values often reflects variations in data collection policies, which may shift across time or locations, even when the underlying feature distribution remains stable. Such shifts in the missingness distribution…

Machine Learning · Statistics 2025-08-15 Jihye Lee, Minseo Kang, Dongha Kim

A Two-Stage Learning-to-Defer Approach for Multi-Task Learning

The Two-Stage Learning-to-Defer (L2D) framework has been extensively studied for classification and, more recently, regression tasks. However, many real-world applications require solving both tasks jointly in a multi-task setting. We…

Machine Learning · Statistics 2025-08-15 Yannis Montreuil, Shu Heng Yeo, Axel Carlier, Lai Xing Ng, Wei Tsang Ooi

Hypothesis Spaces for Deep Learning

This paper introduces a hypothesis space for deep learning based on deep neural networks (DNNs). By treating a DNN as a function of two variables (the input variable and the parameter variable), we consider the set of DNNs where the…

Machine Learning · Statistics 2025-08-15 Rui Wang, Yuesheng Xu, Mingsong Yan

Continuous Parallel Relaxation for Finding Diverse Solutions in Combinatorial Optimization Problems

Finding the optimal solution is often the primary goal in combinatorial optimization (CO). However, real-world applications frequently require diverse solutions rather than a single optimum, particularly in two key scenarios. The first…

Machine Learning · Statistics 2025-08-15 Yuma Ichikawa, Hiroaki Iwashita

Structured Kernel Regression VAE: A Computationally Efficient Surrogate for GP-VAEs in ICA

The interpretability of generative models is considered a key factor in demonstrating their effectiveness and controllability. The generated data are believed to be determined by latent variables that are not directly observable. Therefore,…

Machine Learning · Statistics 2025-08-14 Yuan-Hao Wei, Fu-Hao Deng, Lin-Yong Cui, Yan-Jie Sun

Distributional Sensitivity Analysis: Enabling Differentiability in Sample-Based Inference

We present two analytical formulae for estimating the sensitivity -- namely, the gradient or Jacobian -- at given realizations of an arbitrary-dimensional random vector with respect to its distributional parameters. The first formula…

Machine Learning · Statistics 2025-08-14 Pi-Yueh Chuang, Ahmed Attia, Emil Constantinescu

M-learner: A Flexible And Powerful Framework To Study Heterogeneous Treatment Effect In Mediation Model

We propose a novel method, termed the M-learner, for estimating heterogeneous indirect and total treatment effects and identifying relevant subgroups within a mediation framework. The procedure comprises four key steps. First, we compute…

Machine Learning · Statistics 2025-08-14 Xingyu Li, Qing Liu, Tony Jiang, Hong Amy Xia, Brian P. Hobbs, Peng Wei

Gradient Descent Algorithm in Hilbert Spaces under Stationary Markov Chains with $\phi$- and $\beta$-Mixing

In this paper, we study a strictly stationary Markov chain gradient descent algorithm operating in general Hilbert spaces. Our analysis focuses on the mixing coefficients of the underlying process, specifically the $\phi$- and…

Machine Learning · Statistics 2025-08-14 Priyanka Roy, Susanne Saminger-Platz

A spectral method for multi-view subspace learning using the product of projections

Multi-view data provides complementary information on the same set of observations, with multi-omics and multimodal sensor data being common examples. Analyzing such data typically requires distinguishing between shared (joint) and unique…

Machine Learning · Statistics 2025-08-14 Renat Sergazinov, Armeen Taeb, Irina Gaynanova

Importance Corrected Neural JKO Sampling

In order to sample from an unnormalized probability density function, we propose to combine continuous normalizing flows (CNFs) with rejection-resampling steps based on importance weights. We relate the iterative training of CNFs with…

Machine Learning · Statistics 2025-08-14 Johannes Hertrich, Robert Gruhlke

Bio-Inspired Artificial Neural Networks based on Predictive Coding

Backpropagation (BP) of errors is the backbone training algorithm for artificial neural networks (ANNs). It updates network weights through gradient descent to minimize a loss function representing the mismatch between predictions and…

Machine Learning · Statistics 2025-08-13 Davide Casnici, Charlotte Frenkel, Justin Dauwels

Hierarchical Variable Importance with Statistical Control for Medical Data-Based Prediction

Recent advances in machine learning have greatly expanded the repertoire of predictive methods for medical imaging. However, the interpretability of complex models remains a challenge, which limits their utility in medical applications.…

Machine Learning · Statistics 2025-08-13 Joseph Paillard, Antoine Collas, Denis A. Engemann, Bertrand Thirion

Randomised Postiterations for Calibrated BayesCG

The Bayesian conjugate gradient method offers probabilistic solutions to linear systems but suffers from poor calibration, limiting its utility in uncertainty quantification tasks. Recent approaches leveraging postiterations to construct…

Machine Learning · Statistics 2025-08-13 Niall Vyas, Disha Hegde, Jon Cockayne

Online Covariance Estimation in Nonsmooth Stochastic Approximation

We consider applying stochastic approximation (SA) methods to solve nonsmooth variational inclusion problems. Existing studies have shown that the averaged iterates of SA methods exhibit asymptotic normality, with an optimal limiting…

Machine Learning · Statistics 2025-08-13 Liwei Jiang, Abhishek Roy, Krishna Balasubramanian, Damek Davis, Dmitriy Drusvyatskiy, Sen Na

fastkqr: A Fast Algorithm for Kernel Quantile Regression

Quantile regression is a powerful tool for robust and heterogeneous learning that has seen applications in a diverse range of applied areas. However, its broader application is often hindered by the substantial computational demands arising…

Machine Learning · Statistics 2025-08-13 Qian Tang, Yuwen Gu, Boxiang Wang

Plug-and-Play Posterior Sampling under Mismatched Measurement and Prior Models

Posterior sampling has been shown to be a powerful Bayesian approach for solving imaging inverse problems. The recent plug-and-play unadjusted Langevin algorithm (PnP-ULA) has emerged as a promising method for Monte Carlo sampling and…

Machine Learning · Statistics 2025-08-13 Marien Renaud, Jiaming Liu, Valentin de Bortoli, Andrés Almansa, Ulugbek S. Kamilov

Meta Off-Policy Estimation

Off-policy estimation (OPE) methods enable unbiased offline evaluation of recommender systems, directly estimating the online reward some target policy would have obtained, from offline data and with statistical guarantees. The theoretical…

Machine Learning · Statistics 2025-08-12 Olivier Jeunen