Statistics — Scifaro

Distributionally Robust Transfer Learning with Structurally Missing Covariates, with Application to Cross-National Cardiac Arrest Prediction

Deploying clinical prediction models across healthcare systems often fails when key training covariates are unavailable at deployment and labeled outcomes are limited in the target domain. For example, high-performing models for…

Applications · Statistics 2026-05-26 Siqi Li, Chuan Hong, Ziye Tian, Benjamin Sieu-Hon Leong, Koshi Nakagawa, Hideharu Tanaka, Sang Do Shin, Khuong Quoc Dai, Do Ngoc Son, Marcus Eng Hock Ong, Nan Liu, Molei Liu

Post-Processing Posterior Predictive P-values

This article addresses issues of model criticism and model comparison in Bayesian contexts, and focusses on the use of the so-called posterior predictive p-values (ppp values). These involve a general discrepancy or conflict measure and…

Methodology · Statistics 2026-05-26 Nils Lid Hjort, Fredrik A. Dahl, Gunnhildur Högnadóttir Steinbakk

Modified treatment policies that depend on the natural history of treatment

Longitudinal modified treatment policies (LMTP) are a class of interventions that allow the definition, identification, and estimation of causal effects in general settings, such as with continuous or multivariate exposures, treatment…

Methodology · Statistics 2026-05-26 Iván Díaz, Nicholas T. Williams, Paweł Morzywołek, Kara E. Rudolph

Detecting Metastable Basins in High Dimensions via Marginal Trajectory Distribution Discrimination

We study the problem of identifying dynamically distinct basins of attraction in high dimensional time-homogeneous Markov processes using only trajectory sampling. This problem is fundamental in the analysis of metastable dynamical systems,…

Machine Learning · Statistics 2026-05-26 Taj Jones-McCormick

Heritability: A Counterfactual Perspective

Heritability is a central concept in the long-standing debate about nature versus nurture in biological and social sciences. However, existing notions of heritability are based on strong assumptions and do not use explicit causal models. We…

Applications · Statistics 2026-05-26 Haochen Lei, Jieru Shi, Hongyuan Cao, Qingyuan Zhao

PCA score regression: the art of losing power

The regression of principal component scores (RPCS) on covariates is a widely used analytic approach to detect and test for associations between functional measurements and study participant characteristics. Here we show that: (1) RPCS…

Methodology · Statistics 2026-05-26 Yu Lu, Nidhi Pai, Erjia Cui, Ciprian Crainiceanu

Causality as the Statistical Conscience of Artificial Intelligence: From Pearl's Ladder to Trustworthy Machines

Modern Artificial Intelligence achieves remarkable predictive power by optimizing statistical risk functionals over vast corpora. Yet a gap separates this from genuine intelligence: the inability to distinguish correlation from causation.…

Machine Learning · Statistics 2026-05-26 Ernest Fokoué

Optimal Non-Asymptotic Edgeworth Expansions for Multivariate Neural Network Outputs

Finite-width fully connected neural networks with Gaussian-initialized weights deviate from their infinite-width Gaussian limit, exhibiting non-vanishing higher-order cumulants. We approximate these deviations, for a neural network…

Machine Learning · Statistics 2026-05-26 Lucia Celli

Convergence and non-asymptotic error analysis for kinetic Langevin samplers using the exact harmonic Langevin integrator

We propose a novel kinetic Langevin sampler based on a specific splitting scheme using the exact harmonic Langevin integrator. For strongly log-concave target measures, the sampler exploits a decomposition of the strongly convex potential…

Computation · Statistics 2026-05-26 Katharina Schuh

Possession-Level Player Impact in the Pre-Play-by-Play NBA Era: A Video-Reconstructed RAPM Database, 1984–1996

Regularized Adjusted Plus-Minus (RAPM) is the standard framework for estimating individual player impact in basketball. Its application requires possession-level stint data -- records of which five players shared the court for each…

Applications · Statistics 2026-05-26 Justin Jacobs

Learning Kernel-Based MDPs from Episodic Preferential Feedback

Human feedback often arrives as preferences rather than calibrated numeric rewards, motivating reinforcement learning from preferential feedback, also referred to as reinforcement learning from human feedback (RLHF). We present a rigorous…

Machine Learning · Statistics 2026-05-26 Nikola Pavlovic, Sattar Vakili, Qing Zhao

KAPLAN: Kolmogorov-Arnold Prognostic Learnable Activation Networks for Survival Analysis

Survival analysis aims to model how covariates and time jointly shape the time-to-event distribution under right censoring. Classical methods such as the Cox model and generalised additive models (GAMs) require interactions and time-varying…

Machine Learning · Statistics 2026-05-26 Stelios Boulitsakis Logothetis, Angela Wood, Pietro Liò

Finite-Particle Convergence Rates for Conservative and Non-Conservative Drifting Models

We propose and analyze a conservative drifting method for one-step generative modeling. The method replaces the original displacement-based drifting velocity by a kernel density estimator (KDE)-gradient velocity, namely the difference of…

Machine Learning · Statistics 2026-05-26 Krishnakumar Balasubramanian

Variance-Reduced Manifold Sampling via Polynomial-Maximization Density Estimation

Uniform sampling on implicitly defined manifolds is a core primitive in motion planning, constrained simulation, and probabilistic machine learning. MASEM addresses this problem by entropy-maximizing resampling, but its resampling weights…

Methodology · Statistics 2026-05-26 Serhii Zabolotnii

Reducing Diffusion Model Memorization with Higher Order Langevin Dynamics

Diffusion/score-based models have emerged as powerful generative models, capable of generating high-quality samples that mimic the training data distribution. However, it has been observed that they are prone to reproducing training…

Machine Learning · Statistics 2026-05-26 Benjamin Sterling, Mónica F. Bugallo, Tom Tirer

SURGE: Approximation and Training Free Particle Filter for Diffusion Surrogate

Data assimilation (DA) addresses the problem of sequentially estimating the state of a dynamical system from noisy and incomplete observations. In this work, we employ a diffusion model as a world model to simulate and predict the system's…

Machine Learning · Statistics 2026-05-26 Lifu Wei, Yinuo Ren, Naichen Shi, Yiping Lu

Polynomial Maximization Method with Fractional Polynomial Basis: A Frequentist Bridge to Bayesian Fractional Polynomials

Fractional polynomials are widely used for dose-response modelling, and recent Bayesian fractional polynomial work has renewed interest in this finite model class. We propose PMM-FP, a frequentist extension of Kunchenko's polynomial…

Methodology · Statistics 2026-05-26 Serhii Zabolotnii

Keeping Score: Efficiency Improvements in Neural Likelihood Surrogate Training via Score-Augmented Loss Functions

For stochastic process models, parameter inference is often severely bottlenecked by computationally expensive likelihood functions. Simulation-based inference (SBI) bypasses this restriction by constructing amortized surrogate likelihoods,…

Machine Learning · Statistics 2026-05-26 Alexander Shen, Mikael Kuusela

Learning Preferences from Conjoint Data: A Structural Deep Learning Approach

Conjoint experiments randomize multidimensional profiles, offering a powerful design for recovering structural preference parameters -- including marginal rates of substitution, willingness to pay, and the distribution of preferences across…

Methodology · Statistics 2026-05-26 Avidit Acharya, Jens Hainmueller, Yiqing Xu

Estimating Dynamic Marginal Policy Effects under Sequential Unconfoundedness

We develop methods for estimating how infinitesimal policy changes affect long-term outcomes in dynamic systems. We show that dynamic marginal policy effects (MPEs) can be identified via tractable reduced-form expressions, and can be…

Methodology · Statistics 2026-05-26 I-han Lai, Stefan Wager