Machine Learning - Scifaro

Accelerating Constrained Sampling: A Large Deviations Approach

The problem of sampling a target probability distribution on a constrained domain arises in many applications including machine learning. For constrained sampling, various Langevin algorithms such as projected Langevin Monte Carlo (PLMC),…

Machine Learning · Statistics 2026-04-07 Yingli Wang, Changwei Tu, Xiaoyu Wang, Lingjiong Zhu

Operator Learning for Schrödinger Equation: Unitarity, Error Bounds, and Time Generalization

We consider the problem of learning the evolution operator for the time-dependent Schrödinger equation, where the Hamiltonian may vary with time. Existing neural network-based surrogates often ignore fundamental properties of the…

Machine Learning · Statistics 2026-04-07 Yash Patel, Unique Subedi, Ambuj Tewari

Sparse Max-Affine Regression

This paper presents Sparse Gradient Descent as a solution for variable selection in convex piecewise linear regression, where the model is given as the maximum of $k$ affine functions $ x \mapsto \max_{j \in [k]} \langle a_j^\star, x…

Machine Learning · Statistics 2026-04-07 Haitham Kanj, Seonho Kim, Kiryung Lee

MissNODAG: Differentiable Cyclic Causal Graph Learning from Incomplete Data

Causal discovery in real-world systems, such as biological networks, is often complicated by feedback loops and incomplete data. Standard algorithms, which assume acyclic structures or fully observed data, struggle with these challenges. To…

Machine Learning · Statistics 2026-04-07 Muralikrishnna G. Sethuraman, Razieh Nabi, Faramarz Fekri

Importance Sparsification for Sinkhorn Algorithm

The Sinkhorn algorithm has been used pervasively to approximate the solution to optimal transport (OT) and unbalanced optimal transport (UOT) problems. However, its practical application is limited due to the high computational complexity. To…

Machine Learning · Statistics 2026-04-07 Mengyu Li, Jun Yu, Tao Li, Cheng Meng

Piecewise Deterministic Markov Processes for Bayesian Neural Networks

Inference on modern Bayesian Neural Networks (BNNs) often relies on a variational inference treatment, imposing violated assumptions of independence and the form of the posterior. Traditional MCMC approaches avoid these assumptions at the…

Machine Learning · Statistics 2026-04-07 Ethan Goan, Dimitri Perrin, Kerrie Mengersen, Clinton Fookes

Characterization of Gaussian Universality Breakdown in High-Dimensional Empirical Risk Minimization

We study high-dimensional convex empirical risk minimization (ERM) under general non-Gaussian data designs. By heuristically extending the Convex Gaussian Min-Max Theorem (CGMT) to non-Gaussian settings, we derive an asymptotic min-max…

Machine Learning · Statistics 2026-04-06 Chiheb Yaakoubi, Cosme Louart, Malik Tiomoko, Zhenyu Liao

Inversion-Free Natural Gradient Descent on Riemannian Manifolds

The natural gradient method is widely used in statistical optimization, but its standard formulation assumes a Euclidean parameter space. This paper proposes an inversion-free stochastic natural gradient method for probability distributions…

Machine Learning · Statistics 2026-04-06 Dario Draca, Takuo Matsubara, Minh-Ngoc Tran

Lipschitz bounds for integral kernels

Feature maps associated with positive definite kernels play a central role in kernel methods and learning theory, where regularity properties such as Lipschitz continuity are closely related to robustness and stability guarantees. Despite…

Machine Learning · Statistics 2026-04-06 Justin Reverdi, Sixin Zhang, Fabrice Gamboa, Serge Gratton

State estimations and noise identifications with intermittent corrupted observations via Bayesian variational inference

This paper focuses on the state estimation problem in distributed sensor networks, where intermittent packet dropouts, corrupted observations, and unknown noise covariances coexist. To tackle this challenge, we formulate the joint…

Machine Learning · Statistics 2026-04-06 Peng Sun, Ruoyu Wang, Xue Luo

Structure-Preserving Multi-View Embedding Using Gromov-Wasserstein Optimal Transport

Multi-view data analysis seeks to integrate multiple representations of the same samples in order to recover a coherent low-dimensional structure. Classical approaches often rely on feature concatenation or explicit alignment assumptions,…

Machine Learning · Statistics 2026-04-06 Rafael Pereira Eufrazio, Eduardo Fernandes Montesuma, Charles Casimiro Cavalcante

Learning interacting particle systems from unlabeled data

Learning the potentials of interacting particle systems is a fundamental task across various scientific disciplines. A major challenge is that unlabeled data collected at discrete time points lack trajectory information due to limitations…

Machine Learning · Statistics 2026-04-06 Viska Wei, Fei Lu

Reinforcement Learning from Human Feedback: A Statistical Perspective

Reinforcement learning from human feedback (RLHF) has emerged as a central framework for aligning large language models (LLMs) with human preferences. Despite its practical success, RLHF raises fundamental statistical questions because it…

Machine Learning · Statistics 2026-04-06 Pangpang Liu, Chengchun Shi, Will Wei Sun

Functional Natural Policy Gradients

We propose a cross-fitted debiasing device for policy learning from offline data. A key consequence of the resulting learning principle is $\sqrt N$ regret even for policy classes with complexity greater than Donsker, provided a…

Machine Learning · Statistics 2026-04-06 Aurelien Bibaut, Houssam Zenati, Thibaud Rahier, Nathan Kallus

Privacy-Accuracy Trade-offs in High-Dimensional LASSO under Perturbation Mechanisms

We study privacy-preserving sparse linear regression in the high-dimensional regime, focusing on the LASSO estimator. We analyze two widely used mechanisms for differential privacy: output perturbation, which injects noise into the…

Machine Learning · Statistics 2026-04-06 Ayaka Sakata, Haruka Tanzawa

Fast Best-in-Class Regret for Contextual Bandits

We study the problem of stochastic contextual bandits in the agnostic setting, where the goal is to compete with the best policy in a given class without assuming realizability or imposing model restrictions on losses or rewards. In this…

Machine Learning · Statistics 2026-04-06 Samuel Girard, Aurelien Bibaut, Arthur Gretton, Nathan Kallus, Houssam Zenati

Adaptive randomized pivoting and volume sampling

Adaptive randomized pivoting (ARP) is a recently proposed and highly effective algorithm for column subset selection. This paper reinterprets the ARP algorithm by drawing connections to the volume sampling distribution and active learning…

Machine Learning · Statistics 2026-04-06 Ethan N. Epperly

Learn then Decide: A Learning Approach for Designing Data Marketplaces

As data marketplaces become increasingly central to the digital economy, it is crucial to design efficient pricing mechanisms that optimize revenue while ensuring fair and adaptive pricing. We introduce the Maximum Auction-to-Posted Price…

Machine Learning · Statistics 2026-04-06 Yingqi Gao, Wenlu Xu, Jin J. Zhou, Hua Zhou, Yong Chen, Xiaowu Dai

Central Limit Theorems for Stochastic Gradient Descent Quantile Estimators

This paper develops asymptotic theory for quantile estimation via stochastic gradient descent (SGD) with a constant learning rate. The quantile loss function is neither smooth nor strongly convex. Beyond conventional perspectives and…

Machine Learning · Statistics 2026-04-06 Ziyang Wei, Jiaqi Li, Likai Chen, Wei Biao Wu

BVFLMSP: Bayesian Vertical Federated Learning for Multimodal Survival with Privacy

Multimodal time-to-event prediction often requires integrating sensitive data distributed across multiple parties, making centralized model training impractical due to privacy constraints. At the same time, most existing multimodal survival…

Machine Learning · Statistics 2026-04-03 Abhilash Kar, Basisth Saha, Tanmay Sen, Biswabrata Pradhan