Machine Learning - Scifaro

Drift Estimation for Diffusion Processes Using Neural Networks Based on Discretely Observed Independent Paths

This paper addresses the nonparametric estimation of the drift function over a compact domain for a time-homogeneous diffusion process, based on high-frequency discrete observations from $N$ independent trajectories. We propose a neural…

Machine Learning · Statistics 2026-04-01 Yuzhen Zhao, Yating Liu, Marc Hoffmann

Local Causal Discovery for Statistically Efficient Causal Inference

Causal discovery methods can identify valid adjustment sets for causal effect estimation for a pair of target variables, even when the underlying causal graph is unknown. Global causal discovery methods focus on learning the whole causal…

Machine Learning · Statistics 2026-04-01 Mátyás Schubert, Tom Claassen, Sara Magliacane

Understanding and Improving Shampoo and SOAP via Kullback-Leibler Minimization

Shampoo and its efficient variant, SOAP, employ structured second-moment estimations and have shown strong performance for training neural networks (NNs). In practice, however, Shampoo typically requires step-size grafting with Adam to be…

Machine Learning · Statistics 2026-04-01 Wu Lin, Scott C. Lowe, Felix Dangel, Runa Eschenhagen, Zikun Xu, Roger B. Grosse

Bayesian Additive Regression Trees for functional ANOVA model

Bayesian Additive Regression Trees (BART) is a powerful statistical model that leverages the strengths of Bayesian inference and regression trees. It has received significant attention for capturing complex non-linear relationships and…

Machine Learning · Statistics 2026-04-01 Seokhun Park, Insung Kong, Yongdai Kim

On computing and the complexity of computing higher-order $U$-statistics, exactly

Higher-order $U$-statistics abound in fields such as statistics, machine learning, and computer science, but are known to be highly time-consuming to compute in practice. Despite their widespread appearance, a comprehensive study of their…

Machine Learning · Statistics 2026-04-01 Xingyu Chen, Ruiqi Zhang, Lin Liu

Bayesian Modeling and Estimation of Linear Time-Varying Systems using Neural Networks and Gaussian Processes

The identification of Linear Time-Varying (LTV) systems from input-output data is a fundamental yet challenging ill-posed inverse problem. This work introduces a unified Bayesian framework that models the system's impulse response, $h(t,…

Machine Learning · Statistics 2026-04-01 Yaniv Shulman

Extracting Interpretable Models from Tree Ensembles: Computational and Statistical Perspectives

Tree ensembles are non-parametric methods widely recognized for their accuracy and ability to capture complex interactions. While these models excel at prediction, they are difficult to interpret and may fail to uncover useful relationships…

Machine Learning · Statistics 2026-04-01 Brian Liu, Rahul Mazumder, Peter Radchenko

DeepRV: Accelerating Spatiotemporal Inference with Pre-trained Neural Priors

Gaussian Processes (GPs) provide a flexible and statistically principled foundation for modelling spatiotemporal phenomena, but their $O(N^3)$ scaling makes them intractable for large datasets. Approximate methods such as variational…

Machine Learning · Statistics 2026-04-01 Jhonathan Navott, Daniel Jenson, Seth Flaxman, Elizaveta Semenova

Cheap Bootstrap for Fast Uncertainty Quantification of Stochastic Gradient Descent

Stochastic gradient descent (SGD) or stochastic approximation has been widely used in model training and stochastic optimization. While there is a huge literature on analyzing its convergence, inference on the obtained solutions from SGD…

Machine Learning · Statistics 2026-04-01 Henry Lam, Zitong Wang

LDDMM stochastic interpolants: an application to domain uncertainty quantification in hemodynamics

We introduce a novel conditional stochastic interpolant framework for generative modeling of three-dimensional shapes. The method builds on a recent LDDMM-based registration approach to learn the conditional drift between geometries. By…

Machine Learning · Statistics 2026-03-31 Sarah Katz, Francesco Romor, Jia-Jie Zhu, Alfonso Caiazzo

Persistence diagrams of random matrices via Morse theory: universality and a new spectral diagnostic

We prove that the persistence diagram of the sublevel set filtration of the quadratic form $f(x) = x^T M x$ restricted to the unit sphere $S^{n-1}$ is analytically determined by the eigenvalues of the symmetric matrix $M$. By Morse theory, the…

Machine Learning · Statistics 2026-03-31 Matthew Loftus

Statistical Guarantees for Distributionally Robust Optimization with Optimal Transport and OT-Regularized Divergences

We study finite-sample statistical performance guarantees for distributionally robust optimization (DRO) with optimal transport (OT) and OT-regularized divergence model neighborhoods. Specifically, we derive concentration inequalities for…

Machine Learning · Statistics 2026-03-31 Jeremiah Birrell, Xiaoxi Shen

Energy Score-Guided Neural Gaussian Mixture Model for Predictive Uncertainty Quantification

Quantifying predictive uncertainty is essential for real-world machine learning applications, especially in scenarios requiring reliable and interpretable predictions. Many common parametric approaches rely on neural networks to estimate…

Machine Learning · Statistics 2026-03-31 Yang Yang, Chunlin Ji, Haoyang Li, Ke Deng

On the Loss Landscape Geometry of Regularized Deep Matrix Factorization: Uniqueness and Sharpness

Weight decay is ubiquitous in training deep neural network architectures. Its empirical success is often attributed to capacity control; nonetheless, our theoretical understanding of its effect on the loss landscape and the set of…

Machine Learning · Statistics 2026-03-31 Anil Kamber, Rahul Parhi

Overcoming the Incentive Collapse Paradox

AI-assisted task delegation is increasingly common, yet human effort in such systems is costly and typically unobserved. Recent work by Bastani and Cachon (2025) and Sambasivan et al. (2021) shows that accuracy-based payment schemes suffer…

Machine Learning · Statistics 2026-03-31 Qichuan Yin, Ziwei Su, Shuangning Li

Parameter Estimation in Stochastic Differential Equations via Wiener Chaos Expansion and Stochastic Gradient Descent

This study addresses the inverse problem of parameter estimation for Stochastic Differential Equations (SDEs) by minimizing a regularized discrepancy functional via Stochastic Gradient Descent (SGD). To achieve computational efficiency, we…

Machine Learning · Statistics 2026-03-31 Francisco Delgado-Vences, José Julián Pavón-Español, Arelly Ornelas

Online Statistical Inference of Constant Sample-averaged Q-Learning

Reinforcement learning algorithms have been widely used for decision-making tasks in various domains. However, the performance of these algorithms can be impacted by high variance and instability, particularly in environments with noise or…

Machine Learning · Statistics 2026-03-31 Saunak Kumar Panda, Tong Li, Ruiqi Liu, Yisha Xiang

Static and Dynamic Approaches to Computing Barycenters of Probability Measures on Graphs

The optimal transportation problem defines a geometry of probability measures which leads to a definition for weighted averages (barycenters) of measures, finding application in the machine learning and computer vision communities as a…

Machine Learning · Statistics 2026-03-31 David Gentile, James M. Murphy

Koopman Operator Identification of Model Parameter Trajectories for Temporal Domain Generalization (KOMET)

Parametric models deployed in non-stationary environments degrade as the underlying data distribution evolves over time (a phenomenon known as temporal domain drift). In the current work, we present KOMET (Koopman Operator identification of…

Machine Learning · Statistics 2026-03-31 Randy C. Hoover, Jacob James, Paul May, Kyle Caudle

Statistical Inference for Explainable Boosting Machines

Explainable boosting machines (EBMs) are popular "glass-box" models that learn a set of univariate functions using boosting trees. These achieve explainability through visualizations of each feature's effect. However, unlike linear model…

Machine Learning · Statistics 2026-03-31 Haimo Fang, Kevin Tan, Jonathan Pipping-Gamon, Giles Hooker