Machine Learning - Scifaro

The conditional-mean barrier: From deterministic regression to conditional distribution learning

Many problems in computational science and engineering become one-to-many after coarse graining, partial observation, or inverse reconstruction: a resolved state may not determine a unique subgrid forcing, a structural descriptor may not…

Machine Learning · Statistics 2026-05-28 Junfeng Chen

Deep Neural Network Training as Random Effects: An Optimization-Inference Duality

Deep neural networks (DNNs) have achieved remarkable empirical success, yet their training dynamics remain understood mainly from optimization rather than statistical principles. Here we develop a statistical framework for DNN training in…

Machine Learning · Statistics 2026-05-28 Minhao Yao, Ruoyu Wang, Xihong Lin, Lin Liu, Zhonghua Liu

Is Backpropagation Optimal? When Synthetic Gradients Improve Sample Efficiency

Backpropagation is the default learning rule for artificial neural networks and is often treated as the settled approach whenever differentiability is available. In this work, we revisit this convention through a theoretical lens of sample…

Machine Learning · Statistics 2026-05-28 Yibo Jacky Zhang, Zeyu Tang, Sanmi Koyejo

Learning to target with network interference

This paper studies adaptive targeting under network interference in a bandit setting, where treatments applied to one individual may affect others through spillover effects. We consider a linear model in a sparse regime, where each…

Machine Learning · Statistics 2026-05-28 Xiaomeng Wang, Hamsa Bastani, Osbert Bastani, Zhimei Ren

Soft Specialists: $\alpha$-Rényi Ensembles for Uncertainty-Aware LLM Post-Training

Existing training approaches for large language models learn a single set of parameters, based on large volumes of data, which is typically heterogeneous, conflicting and often outright contradictory. As a result, the model is forced to…

Machine Learning · Statistics 2026-05-28 Paula Cordero-Encinar, Georgy Tyukin, Andrew B. Duncan

Unsupervised Identification and Removal of Spurious Correlations During Fine-Tuning

Fine-tuning a pretrained language model on a curated dataset can produce spurious correlations between the fine-tuning task and unintended latent factors -- such as misaligned personas or political slant -- that the curation procedure has…

Machine Learning · Statistics 2026-05-28 Ciarán M. Gilligan-Lee, Joseph Egan, Yuchen Zhu, Michael O'Riordan

Evolving and Detecting Multi-Turn Deception using Geometric Signatures

Safety defenses for large language models (LLMs) are typically trained and evaluated on single-turn prompts, yet real attacks often unfold as indirect, multi-turn probing. To defend against this more nuanced form of deception, we present a…

Machine Learning · Statistics 2026-05-28 Surender Suresh Kumar, Mary L. Cummings

Accelerating Reinforcement Learning Training Using Simulation Surrogate Models

High-fidelity simulation models are widely used to analyze complex stochastic systems, but their high computational cost motivates the development of cheaper surrogate models that approximate the simulation model's input-output…

Machine Learning · Statistics 2026-05-28 Mohammadmahdi Ghasemloo, David J. Eckman, Yaxian Li

Semiparametrically Efficient Inference for Kernel Measures of Noise Heterogeneity

We develop semiparametrically efficient inference for kernel measures of noise heterogeneity in additive noise models. In many applications, the regression function is estimated using flexible machine learning methods. Downstream procedures…

Machine Learning · Statistics 2026-05-28 Jakub Wornbard, Zikai Shen, Dimitri Meunier, Arthur Gretton

Identifiable Bayesian Deep Generative Copulas with Unknown Layer Widths for Data with Arbitrary Marginal Distributions

Deep generative models offer powerful tools for multivariate data analysis, but their black-box architectures are often unidentified and difficult to interpret. We introduce the Deep Discrete Encoder (DDE) Copula, an identifiable and…

Machine Learning · Statistics 2026-05-28 Joseph Feldman, Yuqi Gu

Iterative Causal Discovery: Per-Edge Impossibility Certificates, Tier-Aware Oracle Queries, and the $1+K$ Lower Bound

Causal-discovery algorithms return a directed graph, yet provide no principled means of distinguishing edge directions identified by the data from those assigned without an identifying assumption. Under the standard Markov and faithfulness…

Machine Learning · Statistics 2026-05-28 Eichi Uehara

Calibrated Inference for the Conditional Average Treatment Effect in the Few-Placebo Regime via Gaussian Processes

Estimating how much an intervention helps a given individual, the conditional average treatment effect (CATE), is increasingly central to decision-making in medicine, economics, and policy, where an estimate is most useful when accompanied by…

Machine Learning · Statistics 2026-05-28 Eichi Uehara

Rao-Blackwellized Score Matching on Manifolds

We study denoising score matching (DSM) when the latent distribution is supported on a smooth embedded manifold $M \subset \mathbb{R}^D$. Under ambient Gaussian corruption, the tangent denoising target contains a singular normal-fiber noise…

Machine Learning · Statistics 2026-05-28 Divit Rawal

Federated Language Models Under Bandwidth Budgets: Distillation Rates and Conformal Coverage

Training a language model on data scattered across bandwidth-limited nodes that cannot be centralized is a setting that arises in clinical networks, enterprise knowledge bases, and scientific consortia. We study the regime in which data…

Machine Learning · Statistics 2026-05-28 Prasanjit Dubey, Xiaoming Huo

No Certificate for Alignment: Two Independent Impossibilities and the Pareto Frontier of Achievable Safety Guarantees

We argue that formal certification of AI alignment over open-ended or unbounded input domains is impossible under standard assumptions in computational complexity and learning theory, and characterise what remains achievable. Two…

Machine Learning · Statistics 2026-05-28 Ayushi Agarwal

Moment Matters: Mean and Variance Causal Graph Discovery from Heteroscedastic Observational Data

Heteroscedasticity -- where the variance of a variable changes with other variables -- is pervasive in real data, and elucidating why it arises from the perspective of statistical moments is crucial in scientific knowledge discovery and…

Machine Learning · Statistics 2026-05-28 Yoichi Chikahara

The Well-Tempered Classifier: Some Elementary Properties of Temperature Scaling

Temperature scaling is a simple method for controlling the uncertainty of probabilistic models. It is mostly used in two contexts: improving the calibration of classifiers and tuning the stochasticity of large language models (LLMs).…

Machine Learning · Statistics 2026-05-28 Pierre-Alexandre Mattei, Bruno Loureiro

Corrected Samplers for Discrete Flow Models

Discrete flow models (DFMs) have been proposed to learn the data distribution on a finite state space, offering a flexible framework as an alternative to discrete diffusion models. A line of recent work has studied samplers for discrete…

Machine Learning · Statistics 2026-05-28 Zhengyan Wan, Yidong Ouyang, Liyan Xie, Hongyuan Zha, Fang Fang, Guang Cheng

DAISI: Data Assimilation with Inverse Sampling using Stochastic Interpolants

Data assimilation (DA) is a cornerstone of scientific and engineering applications, combining model forecasts with sparse and noisy observations to estimate latent system states. Classical high-dimensional DA methods, such as the ensemble…

Machine Learning · Statistics 2026-05-28 Martin Andrae, Erik Wikingsson, So Takao, Tomas Landelius, Fredrik Lindsten

Linear Causal Representation Learning by Topological Ordering, Pruning, and Disentanglement

Causal representation learning (CRL) has garnered increasing interest from the causal inference and artificial intelligence communities due to its potential to disentangle complex data-generating mechanisms into causally interpretable latent…

Machine Learning · Statistics 2026-05-28 Hao Chen, Lin Liu, Yu Guang Wang