Statistical Methodology - Scifaro

Bayesian Sphere-on-Sphere Regression with Optimal Transport Maps

Spherical regression, in which both covariates and responses lie on the sphere, arises in many scientific applications and has attracted considerable methodological attention in recent years. Despite this progress, constructing flexible and…

Statistical Methodology · Statistics 2026-05-19 Tin Lok James Ng, Kwok-Kun Kwong, Jiakun Liu, Andrew Zammit-Mangion

A subgroup-aware scoring approach to the study of effect modification in observational studies

Effect modification means the size of a treatment effect varies with an observed covariate. Generally speaking, a larger treatment effect with more stable error terms is less sensitive to bias. Thus, we might be able to conclude that a…

Statistical Methodology · Statistics 2026-05-19 Yijun Fan, Dylan S. Small

High-dimensional partial linear model with trend filtering

Understanding the links between diet, metabolic changes, and health outcomes is a key focus in nutritional science and broader biological research. Analyzing relationships, such as those between ultra-processed food (UPF) intake and…

Statistical Methodology · Statistics 2026-05-19 Sang Kyu Lee, Erikka Loftfield, Hyokyoung G. Hong, Haolei Weng

Mixture priors for replication studies

Replication of scientific studies is important for assessing the credibility of their results. However, there is no consensus on how to quantify the extent to which a replication study replicates an original result. We propose a novel…

Statistical Methodology · Statistics 2026-05-19 Roberto Macrì-Demartino, Leonardo Egidi, Leonhard Held, Samuel Pawel

On the extensions of the Chatterjee-Spearman test

Chatterjee (2021) introduced a novel independence test that is rank-based, asymptotically normal and consistent against all alternatives. One limitation of Chatterjee's test is its low statistical power for detecting monotonic…

Statistical Methodology · Statistics 2026-05-19 Qingyang Zhang

Collective Outlier Detection and Enumeration with Conformalized Closed Testing

This paper develops a flexible distribution-free method for collective outlier detection and enumeration, designed for situations in which the presence of outliers can be detected powerfully even though their precise identification may be…

Statistical Methodology · Statistics 2026-05-19 Chiara G. Magnani, Matteo Sesia, Aldo Solari

Quick Adaptive Ternary Segmentation: An Efficient Decoding Procedure For Hidden Markov Models

Hidden Markov models (HMMs) are characterized by an unobservable Markov chain and an observable process -- a noisy version of the hidden chain. Decoding the original signal from the noisy observations is one of the main goals in nearly all…

Statistical Methodology · Statistics 2026-05-19 Alexandre Mösching, Housen Li, Axel Munk

FRESH: Information-Geometric Calibration of Patient-Level Models to Aggregate Evidence

This note introduces FRESH (Fusion of Recent Evidence and Subject Histories), a method for incorporating population-level summary results -- published clinical trials, registry summaries, prior natural-history studies, and peer-reviewed…

Statistical Methodology · Statistics 2026-05-18 Franklin Fuller, Daniele Bertolini, Samantha Liang, Jason Christopher, Aaron M. Smith

Why Empirical p-Values Are Not Uniform: Reference Samples, Dependence, and PIT Backtesting

Probability integral transforms (PITs) and empirical $p$-values are widely used to assess the calibration of predictive distributions. While exact PIT values are uniformly distributed under correct model specification, practical…

Statistical Methodology · Statistics 2026-05-18 Jakub Lis

REX-SUB: A Scalable Subsampling Strategy for Modeling Large Spatial Datasets

Recent advances in data collection technologies have led to the emergence of massive spatial datasets, with measurements obtained at millions of spatial locations. Geostatistical models typically employ Gaussian processes (GPs) to capture…

Statistical Methodology · Statistics 2026-05-18 Nicholas Rios, Ben Seiyon Lee

Statistical Inference for Smoothed Support Vector Machines in High Dimensions: From Offline to Online Data

High-dimensional classification problems often rely on the Lasso-penalized linear Support Vector Machines (SVMs). However, the double non-smoothness induced by the hinge loss and Lasso penalty in this model makes statistical inference…

Statistical Methodology · Statistics 2026-05-18 Shuya Zhou, Junwen Xia, Jingxiao Zhang

A Model-Agnostic Bootstrap for Macro-Level Claims Reserving Under the Conditioning Principle

The correct inferential object in claims reserving is the conditional predictive distribution $p(R \mid \mathcal{D}, \hat\theta)$, where $\mathcal{D}$ is the observed triangle held fixed. We refer to this as the conditioning principle. All…

Statistical Methodology · Statistics 2026-05-18 Robin Van Oirbeek, Tim Verdonck

Bayesian Inference for Non-Conjugate Distance Dependent Chinese Restaurant Process Models

The distance dependent Chinese Restaurant Process (ddCRP) provides a flexible prior distribution for clustering observations, incorporating covariate information through pairwise distances and accommodating a rich variety of cluster…

Statistical Methodology · Statistics 2026-05-18 Joseph Marsh, Theodore Kypraios, Rowland G. Seymour

The Negative Binomial Chain-Ladder: A Full Likelihood Model for Claim Count Reserving

The Chain-Ladder (CL) method remains the dominant macro-level technique for claims reserving in non-life insurance, yet its classical formulation lacks a coherent probabilistic foundation. Existing stochastic extensions, including the Mack…

Statistical Methodology · Statistics 2026-05-18 Robin Van Oirbeek

Generalized raking and stabilized weights for regression modeling in two-phase samples

In regression models fitted to data from complex survey designs, sampling weights often incorporate non-essential variation, inflating variance estimates. Stabilized weights mitigate this issue by adjusting sampling weights to account for…

Statistical Methodology · Statistics 2026-05-18 Tong Chen, Joshua Slone, Gustavo Amorim, Pamela A. Shaw, Bryan E. Shepherd, Thomas Lumley

Re-examining and calibrating weighted survival analysis for causal inference

Causal inference with time-to-event outcomes is fundamental in various scientific studies. In a static setup with fitted propensity scores, weighted Kaplan-Meier estimation for survival probabilities and weighted Breslow-Peto estimation for…

Statistical Methodology · Statistics 2026-05-18 Wenfu Xu, Yi Zhang, Tobias Gerhard, Zhiqiang Tan

Leveraging heterogeneity for identifiability: Bayesian order-based learning of multiple DAGs

We propose a joint order-based scoring framework for causal structure learning of directed acyclic graph (DAG) models under heterogeneous data settings. We show that leveraging heterogeneity improves the accuracy of causal ordering…

Statistical Methodology · Statistics 2026-05-18 Hyunwoong Chang, Fariha Taskin

Structured Transfer Learning for Survival Risk Stratification in Data-Sparse Clinical Cohorts

Background: Survival prediction models are often less reliable in clinical groups with limited sample sizes or few outcome events. Target-only models may be unstable, whereas models from larger cohorts may transfer poorly when risk-factor…

Statistical Methodology · Statistics 2026-05-18 Junhan Yu, Yurui Chen, Juan Delgado-SanMartin, Dennis Wang, Hong Pan, Doudou Zhou

Tail postcoloring in long-run variance estimation of time series

Prewhitening is a common approach to deal with strong autocorrelation. In this article, we propose a new approach called tail postcoloring, motivated by it. It uses parametric models to project, or color back, the neglected tail…

Statistical Methodology · Statistics 2026-05-18 Xu Liu, Kin Wai Chan

Improving the Efficiency of Subgroup Analysis in Randomized Controlled Trials with TMLE

Subgroup analyses within randomized controlled trials are often underpowered due to limited sample sizes. We address this challenge by leveraging trial participants outside the subgroup of interest to augment estimation within the subgroup.…

Statistical Methodology · Statistics 2026-05-18 Sky Qiu, Nerissa Nance, Rachael Phillips, Jens Tarp, Maya Petersen, Mark van der Laan