Machine Learning - Scifaro

Swift Hydra: Self-Reinforcing Generative Framework for Anomaly Detection with Multiple Mamba Models

Despite a plethora of anomaly detection models developed over the years, their ability to generalize to unseen anomalies remains an issue, particularly in critical systems. This paper aims to address this challenge by introducing Swift…

Machine Learning · Statistics 2025-03-26 Nguyen Do, Truc Nguyen, Malik Hassanaly, Raed Alharbi, Jung Taek Seo, My T. Thai

Nonparametric estimation of Hawkes processes with RKHSs

This paper addresses nonparametric estimation of nonlinear multivariate Hawkes processes, where the interaction functions are assumed to lie in a reproducing kernel Hilbert space (RKHS). Motivated by applications in neuroscience, the model…

Machine Learning · Statistics 2025-03-26 Anna Bonnet, Maxime Sangnier

Federated Causal Inference: Multi-Study ATE Estimation beyond Meta-Analysis

We study Federated Causal Inference, an approach to estimate treatment effects from decentralized data across centers. We compare three classes of Average Treatment Effect (ATE) estimators derived from the Plug-in G-Formula, ranging from…

Machine Learning · Statistics 2025-03-26 Rémi Khellaf, Aurélien Bellet, Julie Josse

Stochastic neighborhood embedding and the gradient flow of relative entropy

Dimension reduction, widely used in science, maps high-dimensional data into low-dimensional space. We investigate a basic mathematical model underlying the techniques of stochastic neighborhood embedding (SNE) and its popular variant…

Machine Learning · Statistics 2025-03-26 Ben Weinkove

Targeted Separation and Convergence with Kernel Discrepancies

Maximum mean discrepancies (MMDs) like the kernel Stein discrepancy (KSD) have grown central to a wide range of applications, including hypothesis testing, sampler selection, distribution approximation, and variational inference. In each…

Machine Learning · Statistics 2025-03-26 Alessandro Barp, Carl-Johann Simon-Gabriel, Mark Girolami, Lester Mackey

Learning a Class of Mixed Linear Regressions: Global Convergence under General Data Conditions

Mixed linear regression (MLR) has attracted increasing attention because of its great theoretical and practical importance in capturing nonlinear relationships by utilizing a mixture of linear regression sub-models. Although considerable…

Machine Learning · Statistics 2025-03-25 Yujing Liu, Zhixin Liu, Lei Guo

A New Stochastic Approximation Method for Gradient-based Simulated Parameter Estimation

This paper tackles the challenge of parameter calibration in stochastic models, particularly in scenarios where the likelihood function is unavailable in an analytical form. We introduce a gradient-based simulated parameter estimation…

Machine Learning · Statistics 2025-03-25 Zehao Li, Yijie Peng

Quantile-Based Randomized Kaczmarz for Corrupted Tensor Linear Systems

The reconstruction of tensor-valued signals from corrupted measurements, known as tensor regression, has become essential in many multi-modal applications such as hyperspectral image reconstruction and medical imaging. In this work, we…

Machine Learning · Statistics 2025-03-25 Alejandra Castillo, Jamie Haddock, Iryna Hartsock, Paulina Hoyos, Lara Kassab, Alona Kryshchenko, Kamila Larripa, Deanna Needell, Shambhavi Suryanarayanan, Karamatou Yacoubou Djima

Understanding Inverse Reinforcement Learning under Overparameterization: Non-Asymptotic Analysis and Global Optimality

The goal of the inverse reinforcement learning (IRL) task is to identify the underlying reward function and the corresponding optimal policy from a set of expert demonstrations. While most IRL algorithms' theoretical guarantees rely on a…

Machine Learning · Statistics 2025-03-25 Ruijia Zhang, Siliang Zeng, Chenliang Li, Alfredo Garcia, Mingyi Hong

Synthesis and Analysis of Data as Probability Measures with Entropy-Regularized Optimal Transport

We consider synthesis and analysis of probability measures using the entropy-regularized Wasserstein-2 cost and its unbiased version, the Sinkhorn divergence. The synthesis problem consists of computing the barycenter, with respect to these…

Machine Learning · Statistics 2025-03-25 Brendan Mallery, James M. Murphy, Shuchin Aeron

Score matching through the roof: linear, nonlinear, and latent variables causal discovery

Causal discovery from observational data holds great promise, but existing methods rely on strong assumptions about the underlying causal structure, often requiring full observability of all relevant variables. We tackle these challenges by…

Machine Learning · Statistics 2025-03-25 Francesco Montagna, Philipp M. Faller, Patrick Bloebaum, Elke Kirschbaum, Francesco Locatello

Consistent Validation for Predictive Methods in Spatial Settings

Spatial prediction tasks are key to weather forecasting, studying air pollution impacts, and other scientific endeavors. Determining how much to trust predictions made by statistical or physical methods is essential for the credibility of…

Machine Learning · Statistics 2025-03-25 David R. Burt, Yunyi Shen, Tamara Broderick

Distributionally Robust Learning for Multi-source Unsupervised Domain Adaptation

Empirical risk minimization often performs poorly when the distribution of the target domain differs from those of source domains. To address such potential distribution shifts, we develop an unsupervised domain adaptation approach that…

Machine Learning · Statistics 2025-03-25 Zhenyu Wang, Peter Bühlmann, Zijian Guo

Optimal Approximation of Zonoids and Uniform Approximation by Shallow Neural Networks

We study the following two related problems. The first is to determine to what error an arbitrary zonoid in $\mathbb{R}^{d+1}$ can be approximated in the Hausdorff distance by a sum of $n$ line segments. The second is to determine optimal…

Machine Learning · Statistics 2025-03-25 Jonathan W. Siegel

On Neural Networks as Infinite Tree-Structured Probabilistic Graphical Models

Deep neural networks (DNNs) lack the precise semantics and definitive probabilistic interpretation of probabilistic graphical models (PGMs). In this paper, we propose an innovative solution by constructing infinite tree-structured PGMs that…

Machine Learning · Statistics 2025-03-25 Boyao Li, Alexander J. Thomson, Houssam Nassif, Matthew M. Engelhard, David Page

Generative adversarial framework to calibrate excursion set models for the 3D morphology of all-solid-state battery cathodes

This paper presents a computational method for generating virtual 3D morphologies of functional materials using low-parametric stochastic geometry models, i.e., digital twins, calibrated with 2D microscopy images. These digital twins allow…

Machine Learning · Statistics 2025-03-24 Orkun Furat, Sabrina Weber, Johannes Schubert, René Rekers, Maximilian Luczak, Erik Glatt, Andreas Wiegmann, Jürgen Janek, Anja Bielefeld, Volker Schmidt

Online Selective Conformal Prediction: Errors and Solutions

In online selective conformal inference, data arrives sequentially, and prediction intervals are constructed only when an online selection rule is met. Since online selections may break the exchangeability between the selected test datum…

Machine Learning · Statistics 2025-03-24 Yusuf Sale, Aaditya Ramdas

EarlyStopping: Implicit Regularization for Iterative Learning Procedures in Python

Iterative learning procedures are ubiquitous in machine learning and modern statistics. Regularisation is typically required to prevent inflating the expected loss of a procedure in later iterations via the propagation of noise inherent in…

Machine Learning · Statistics 2025-03-24 Eric Ziebell, Ratmir Miftachov, Bernhard Stankewitz, Laura Hucker

Procrustes Wasserstein Metric: A Modified Benamou-Brenier Approach with Applications to Latent Gaussian Distributions

We introduce a modified Benamou-Brenier-type approach leading to a Wasserstein-type distance that allows global invariance, specifically under isometries, and we show that the problem can be reduced to orthogonal transformations. This…

Machine Learning · Statistics 2025-03-24 Kevine Meugang Toukam

SNPL: Simultaneous Policy Learning and Evaluation for Safe Multi-Objective Policy Improvement

To design effective digital interventions, experimenters face the challenge of learning decision policies that balance multiple objectives using offline data. Often, they aim to develop policies that maximize goal outcomes, while ensuring…

Machine Learning · Statistics 2025-03-24 Brian Cho, Ana-Roxana Pop, Ariel Evnine, Nathan Kallus