Machine Learning - Scifaro

Spatial Adapter: Structured Spatial Decomposition and Closed-Form Covariance for Frozen Predictors

We present the Spatial Adapter, a parameter-efficient post-hoc layer that equips any frozen first-stage predictor with a structured spatial representation of its residual field and an induced closed-form spatial covariance. The adapter…

Machine Learning · Statistics 2026-05-13 Wen-Ting Wang, Wei-Ying Wu, Hao-Yun Huang, Xuan-Chun Wang

Adaptive Policy Learning Under Unknown Network Interference

Adaptive experimentation under unknown network interference requires solving two coupled problems: (i) learning the underlying dynamics of interference among units and (ii) using these dynamics to inform treatment allocation in order to…

Machine Learning · Statistics 2026-05-13 Aidan Gleich, Eric Laber, Alexander Volfovsky

Interpretable Machine Learning for Spatial Science: A Lie-Algebraic Kernel for Rotationally Anisotropic Gaussian Processes

Many three-dimensional spatial fields are anisotropic, with directions of rapid and slow variation that need not align with the coordinate axes. Standard Gaussian process kernels with Automatic Relevance Determination (ARD) capture only…

Machine Learning · Statistics 2026-05-13 Kane Warrior, Dalia Chakrabarty

Uniform Scaling Limits in AdamW-Trained Transformers

We study the large-depth limit of transformers trained with AdamW, by modelling the hidden-state dynamics as an interacting particle system (IPS) coupled through the attention mechanism. Under appropriate scaling of the attention heads, we…

Machine Learning · Statistics 2026-05-13 William Gibson, Christoph Reisinger

Sharp feature-learning transitions and Bayes-optimal neural scaling laws in extensive-width networks

We study the information-theoretic limits of learning a one-hidden-layer teacher network with hierarchical features from noisy queries, in the context of knowledge transfer to a smaller student model. We work in the high-dimensional regime…

Machine Learning · Statistics 2026-05-13 Minh-Toan Nguyen, Jean Barbier

One Operator for Many Densities: Amortized Approximation of Conditioning by Neural Operators

Probabilistic conditioning is concerned with the identification of a distribution of a random variable $X$ given a random variable $Y$. It is a cornerstone of scientific and engineering applications where modeling uncertainty is key. This…

Machine Learning · Statistics 2026-05-13 Panos Tsimpos, Edoardo Calvello, Ayoub Belhadji, Nicholas H. Nelsen

Identifiability and Stability of Generative Drifting with Companion-Elliptic Kernel Families

This paper studies the identifiability and stability of drifting fields within the framework of Generative Modeling via Drifting. The motivating question is whether a zero-drift equilibrium identifies the target distribution, and whether an…

Machine Learning · Statistics 2026-05-13 HakGeun Lee, Hyonho Chun

Explicit integral representations and quantitative bounds for two-layer ReLU networks

An approach to construct explicit integral representations for two-layer ReLU networks is presented, which provides relatively simple representations for any multivariate polynomial. Quantitative bounds are provided for a particular,…

Machine Learning · Statistics 2026-05-13 Anthony Lee

Doubly Outlier-Robust Online Infinite Hidden Markov Model

We derive a robust update rule for the online infinite hidden Markov model (iHMM) for when the streaming data contains outliers and the model is misspecified. Leveraging recent advances in generalised Bayesian inference, we define…

Machine Learning · Statistics 2026-05-13 Horace Yiu, Leandro Sánchez-Betancourt, Álvaro Cartea, Gerardo Duran-Martin

Maximin Robust Bayesian Experimental Design

We address the brittleness of Bayesian experimental design under model misspecification by formulating the problem as a max-min game between the experimenter and an adversarial nature subject to information-theoretic constraints. We…

Machine Learning · Statistics 2026-05-13 Hany Abdulsamad, Sahel Iqbal, Christian A. Naesseth, Takuo Matsubara, Adrien Corenflos

Targeted Synthetic Control Method

The synthetic control method (SCM) estimates causal effects in panel data with a single treated unit by constructing a counterfactual outcome as a weighted combination of untreated control units that matches the pre-treatment trajectory. In…

Machine Learning · Statistics 2026-05-13 Yuxin Wang, Dennis Frauen, Emil Javurek, Konstantin Hess, Yuchen Ma, Stefan Feuerriegel

Provably Data-driven Multiple Hyper-parameter Tuning with Structured Loss Function

Data-driven algorithm design automates hyperparameter tuning, but its statistical foundations remain limited because model performance can depend on hyperparameters in implicit and highly non-smooth ways. Existing guarantees focus on the…

Machine Learning · Statistics 2026-05-13 Tung Quoc Le, Anh Tuan Nguyen, Viet Anh Nguyen

Sparse Offline Reinforcement Learning with Corruption Robustness

We investigate robustness to strong data corruption in offline sparse reinforcement learning (RL). In our setting, an adversary may arbitrarily perturb a fraction of the collected trajectories from a high-dimensional but sparse Markov…

Machine Learning · Statistics 2026-05-13 Nam Phuong Tran, Andi Nika, Goran Radanovic, Long Tran-Thanh, Debmalya Mandal

Arbitrated Indirect Treatment Comparisons

Matching-adjusted indirect comparison (MAIC) has been increasingly employed in health technology assessments (HTA). By reweighting subjects from a trial with individual participant data (IPD) to match the covariate summary statistics of…

Machine Learning · Statistics 2026-05-13 Yixin Fang, Weili He

Learning density ratios in causal inference using Bregman-Riesz regression

The ratio of two probability density functions is a fundamental quantity that appears in many areas of statistics and machine learning, including causal inference, reinforcement learning, covariate shift, outlier detection, independence…

Machine Learning · Statistics 2026-05-13 Oliver J. Hines, Caleb H. Miles

Multi-modal Bayesian Neural Network Surrogates with Conjugate Last-Layer Estimation

As data collection and simulation capabilities advance, multi-modal learning, the task of learning from multiple modalities and sources of data, is becoming an increasingly important area of research. Surrogate models that learn from data…

Machine Learning · Statistics 2026-05-13 Ian Taylor, Juliane Mueller, Julie Bessac

Semi-Supervised Bayesian GANs with Log-Signatures for Uncertainty-Aware Credit Card Fraud Detection

We present a novel deep generative semi-supervised framework for credit card fraud detection, formulated as a time series classification task. As financial transaction data streams grow in scale and complexity, traditional methods often…

Machine Learning · Statistics 2026-05-13 David Hirnschall

Improving the Accuracy of Amortized Model Comparison with Self-Consistency

Amortized Bayesian model comparison (BMC) enables fast probabilistic ranking of models via simulation-based training of neural surrogates. However, the accuracy of neural surrogates deteriorates when simulation models are misspecified; the…

Machine Learning · Statistics 2026-05-13 Šimon Kucharský, Aayush Mishra, Daniel Habermann, Stefan T. Radev, Paul-Christian Bürkner

Sequential Off-Policy Learning with Logarithmic Smoothing

Off-policy learning enables training policies from logged interaction data. Most prior work considers the batch setting, where a policy is learned from data generated by a single behavior policy. In real systems, however, policies are…

Machine Learning · Statistics 2026-05-13 Maxime Haddouche, Otmane Sakhi

Stationary MMD Points

Approximation of a target probability distribution using a finite set of points is a problem of fundamental importance in numerical integration. Several authors have proposed to select points by minimising a maximum mean discrepancy (MMD),…

Machine Learning · Statistics 2026-05-13 Zonghao Chen, Toni Karvonen, Heishiro Kanagawa, François-Xavier Briol, Chris. J. Oates