Machine Learning - Scifaro

Auto-differentiable data assimilation: Co-learning of states, dynamics, and filtering algorithms

Data assimilation algorithms estimate the state of a dynamical system from partial observations, where the successful performance of these algorithms hinges on costly parameter tuning and on employing an accurate model for the dynamics.…

Machine Learning · Statistics 2026-03-24 Melissa Adrian, Daniel Sanz-Alonso, Rebecca Willett

High-dimensional online learning via asynchronous decomposition: Non-divergent results, dynamic regularization, and beyond

Existing high-dimensional online learning methods often face the challenge that their error bounds, or per-batch sample sizes, diverge as the number of data batches increases. To address this issue, we propose an asynchronous decomposition…

Machine Learning · Statistics 2026-03-24 Shixiang Liu, Zhifan Li, Hanming Yang, Jianxin Yin

Sinkhorn Based Associative Memory Retrieval Using Spherical Hellinger Kantorovich Dynamics

We propose a dense associative memory for empirical measures (weighted point clouds). Stored patterns and queries are finitely supported probability measures, and retrieval is defined by minimizing a Hopfield-style log-sum-exp energy built…

Machine Learning · Statistics 2026-03-24 Aratrika Mustafi, Soumya Mukherjee

LassoFlexNet: Flexible Neural Architecture for Tabular Data

Despite their dominance in vision and language, deep neural networks often underperform relative to tree-based models on tabular data. To bridge this gap, we incorporate five key inductive biases into deep learning: robustness to irrelevant…

Machine Learning · Statistics 2026-03-24 Kry Yik Chau Lui, Cheng Chi, Kishore Basu, Yanshuai Cao

Interpretable Operator Learning for Inverse Problems via Adaptive Spectral Filtering: Convergence and Discretization Invariance

Solving ill-posed inverse problems necessitates effective regularization strategies to stabilize the inversion process against measurement noise. While classical methods like Tikhonov regularization require heuristic parameter tuning, and…

Machine Learning · Statistics 2026-03-24 Hang-Cheng Dong, Pengcheng Cheng, Shuhuan Li

CogFormer: Learn All Your Models Once

Simulation-based inference (SBI) with neural networks has accelerated and transformed cognitive modeling workflows. SBI enables modelers to fit complex models that were previously difficult or impossible to estimate, while also allowing…

Machine Learning · Statistics 2026-03-24 Jerry M. Huang, Lukas Schumacher, Niek Stevenson, Stefan T. Radev

Comprehensive Description of Uncertainty in Measurement for Representation and Propagation with Scalable Precision

Probability theory has become the predominant framework for quantifying uncertainty across scientific and engineering disciplines, with a particular focus on measurement and control systems. However, the widespread reliance on simple…

Machine Learning · Statistics 2026-03-24 Ali Darijani, Jürgen Beyerer, Zahra Sadat Hajseyed Nasrollah, Luisa Hoffmann, Michael Heizmann

Pseudo-Labeling for Unsupervised Domain Adaptation with Kernel GLMs

We propose a principled framework for unsupervised domain adaptation under covariate shift in kernel Generalized Linear Models (GLMs), encompassing kernelized linear, logistic, and Poisson regression with ridge regularization. Our goal is…

Machine Learning · Statistics 2026-03-24 Nathan Weill, Kaizheng Wang

Multi-Domain Empirical Bayes for Linearly-Mixed Causal Representations

Causal representation learning (CRL) aims to learn low-dimensional causal latent variables from high-dimensional observations. While identifiability has been extensively studied for CRL, estimation has been less explored. In this paper, we…

Machine Learning · Statistics 2026-03-24 Bohan Wu, Julius von Kügelgen, David M. Blei

Power-SMC: Low-Latency Sequence-Level Power Sampling for Training-Free LLM Reasoning

Many recent reasoning gains in large language models can be explained as distribution sharpening: biasing generation toward high-likelihood trajectories already supported by the pretrained model, rather than modifying its weights. A natural…

Machine Learning · Statistics 2026-03-24 Seyedarmin Azizi, Erfan Baghaei Potraghloo, Minoo Ahmadi, Souvik Kundu, Massoud Pedram

BioBO: Biology-informed Bayesian Optimization for Perturbation Design

Efficient design of genomic perturbation experiments is crucial for accelerating drug discovery and therapeutic target identification, yet exhaustive perturbation of the human genome remains infeasible due to the vast search space of…

Machine Learning · Statistics 2026-03-24 Yanke Li, Tianyu Cui, Tommaso Mansi, Mangal Prakash, Rui Liao

Proximal Point Nash Learning from Human Feedback

Traditional Reinforcement Learning from Human Feedback (RLHF) often relies on reward models, frequently assuming preference structures like the Bradley-Terry model, which may not accurately capture the complexities of real human…

Machine Learning · Statistics 2026-03-24 Daniil Tiapkin, Daniele Calandriello, Denis Belomestny, Eric Moulines, Alexey Naumov, Kashif Rasul, Michal Valko, Pierre Menard

A Computational Transition for Detecting Multivariate Shuffled Linear Regression by Low-Degree Polynomials

In this paper, we study the problem of multivariate shuffled linear regression, where the correspondence between predictors and responses in a linear model is obfuscated by a latent permutation. Specifically, we investigate the model…

Machine Learning · Statistics 2026-03-24 Zhangsong Li

Tightening optimality gap with confidence through conformal prediction

Decision makers routinely use constrained optimization technology to plan and operate complex systems like global supply chains or power grids. In this context, practitioners must assess how close a computed solution is to optimality in…

Machine Learning · Statistics 2026-03-24 Miao Li, Michael Klamkin, Russell Bent, Pascal Van Hentenryck

Scalable Learning from Probability Measures with Mean Measure Quantization

We consider statistical learning problems in which data are observed as a set of probability measures. Optimal transport (OT) is a popular tool to compare and manipulate such objects, but its computational cost becomes prohibitive when the…

Machine Learning · Statistics 2026-03-24 Erell Gachon, Elsa Cazelles, Jérémie Bigot

A Training-free Method for LLM Text Attribution

Verifying the provenance of content is crucial to the functioning of many organizations, e.g., educational institutions, social media platforms, and firms. This problem is becoming increasingly challenging as text generated by Large…

Machine Learning · Statistics 2026-03-24 Tara Radvand, Mojtaba Abdolmaleki, Mohamed Mostagir, Ambuj Tewari

Fast convergence of a Federated Expectation-Maximization Algorithm

Data heterogeneity has been a long-standing bottleneck in studying the convergence rates of Federated Learning algorithms. To better understand this issue, we study the convergence rate of the…

Machine Learning · Statistics 2026-03-24 Zhixu Tao, Rajita Chandak, Sanjeev Kulkarni

Interacting Particle Systems on Networks: joint inference of the network and the interaction kernel

Modeling multi-agent systems on networks is a fundamental challenge in a wide variety of disciplines. Given data consisting of multiple trajectories, we jointly infer the (weighted) network and the interaction kernel, which determine,…

Machine Learning · Statistics 2026-03-24 Quanjun Lang, Xiong Wang, Fei Lu, Mauro Maggioni

High Confidence Level Inference is Almost Free using Parallel Stochastic Optimization

Uncertainty quantification for estimation through stochastic optimization solutions in an online setting has gained popularity recently. This paper introduces a novel inference method focused on constructing confidence intervals with…

Machine Learning · Statistics 2026-03-24 Wanrong Zhu, Zhipeng Lou, Ziyang Wei, Wei Biao Wu

The surrogate Gibbs-posterior of a corrected stochastic MALA: Towards uncertainty quantification for neural networks

MALA is a popular gradient-based Markov chain Monte Carlo method to access the Gibbs-posterior distribution. Stochastic MALA (sMALA) scales to large data sets, but changes the target distribution from the Gibbs-posterior to a surrogate…

Machine Learning · Statistics 2026-03-24 Sebastian Bieringer, Gregor Kasieczka, Maximilian F. Steffen, Mathias Trabs