Statistical Methodology - Scifaro

Data Fusion with Distributional Equivalence Test-then-pool

Randomized controlled trials (RCTs) are the gold standard for causal inference, yet practical constraints often limit the size of the concurrent control arm. Borrowing control data from previous trials offers a potential efficiency gain,…

Statistical Methodology · Statistics 2026-03-17 Linying Yang, Xing Liu, Robin J. Evans

Clustering-Based Outcome Models for Clinical Studies: A Scoping Review

This review provides a systematic overview of methods that combine covariate-based clustering of observational units (patients) with outcome models for clinical studies. We distinguish between informed-cluster models, where the outcome…

Statistical Methodology · Statistics 2026-03-17 Johannes Vilsmeier, Fabian Eibensteiner, Franz König, Francois Mercier, Robin Ristl, Nigel Stallard, Marc Vandemeulebroecke, Sarah Zohar, Martin Posch

Leave-One-Out Neighborhood Smoothing for Graphons: Berry-Esseen Bounds, Confidence Intervals, and Honest Tuning

Neighborhood smoothing methods achieve minimax-optimal rates for estimating edge probabilities under graphon models, but their use for statistical inference has remained limited. The main obstacle is that classical neighborhood smoothers…

Statistical Methodology · Statistics 2026-03-17 Behzad Aalipur, Rachel Kilby

Demystifying Proximal Causal Inference

Proximal causal inference (PCI) has emerged as a promising framework for identifying and estimating causal effects in the presence of unobserved confounders. While many traditional causal inference methods rely on the assumption of no…

Statistical Methodology · Statistics 2026-03-17 Grace V. Ringlein, Trang Quynh Nguyen, Peter P. Zandi, Elizabeth A. Stuart, Harsh Parikh

Hierarchical Clustering With Confidence

Agglomerative hierarchical clustering is one of the most widely used approaches for exploring how observations in a dataset relate to each other. However, its greedy nature makes it highly sensitive to small perturbations in the data, often…

Statistical Methodology · Statistics 2026-03-17 Di Wu, Jacob Bien, Snigdha Panigrahi

Nonparametric efficient estimation of the longitudinal front-door functional

The front-door criterion is an identification strategy for the intervention-specific mean outcome in settings where the standard back-door criterion fails due to unmeasured exposure-outcome confounders, but an intermediate variable exists…

Statistical Methodology · Statistics 2026-03-17 Marie S. Breum, Helene C. W. Rytgaard, Torben Martinussen, Erin E. Gabriel

Sampling as Bandits: Evaluation-Efficient Design for Black-Box Densities

We propose bandit importance sampling (BIS), a powerful importance sampling framework tailored for settings in which evaluating the target density is computationally expensive. BIS facilitates accurate sampling while minimizing the required…

Statistical Methodology · Statistics 2026-03-17 Takuo Matsubara, Andrew Duncan, Simon Cotter, Konstantinos Zygalakis

Parsimonious Compactly Supported Covariance Models in the Gauss Hypergeometric Class: Identifiability, Reparameterizations, and Asymptotic Properties

We study covariance functions in the Gauss hypergeometric ($\mathcal{GH}$) class, a flexible family that encompasses the Generalized Wendland ($\mathcal{GW}$) and Matérn ($\mathcal{MT}$) models. We derive sharp validity conditions,…

Statistical Methodology · Statistics 2026-03-17 Moreno Bevilacqua, Christian Caamaño-Carrillo, Tarik Faouzi, Xavier Emery

Controlling the false discovery rate in high-dimensional linear models using model-X knockoffs and $p$-values

We propose a novel multiple testing methodology for controlling the false discovery rate (FDR) in high-dimensional linear models that integrates model-X knockoff techniques with debiased penalized regression estimators. At the foundation of…

Statistical Methodology · Statistics 2026-03-17 Jinyuan Chang, Chenlong Li, Cheng Yong Tang, Zhengtian Zhu

Design of Bayesian Clinical Trials with Clustered Data

In the design of clinical trials, it is essential to assess the design operating characteristics (e.g., power and the type I error rate). Common practice for the evaluation of operating characteristics in Bayesian clinical trials relies on…

Statistical Methodology · Statistics 2026-03-17 Luke Hagar, Shirin Golchi

Inside-out cross-covariance for spatial multivariate data

As the spatial features of multivariate data are increasingly central in researchers' applied problems, there is a growing demand for novel spatially-aware methods that are flexible, easily interpretable, and scalable to large data. We…

Statistical Methodology · Statistics 2026-03-17 Michele Peruzzi

Detection of Multiple Influential Observations on Model Selection

Outlying observations are frequently encountered across a wide spectrum of scientific domains, posing notable challenges to the generalizability of statistical models and the reproducibility of downstream analysis. They are identified…

Statistical Methodology · Statistics 2026-03-17 Dongliang Zhang, Masoud Asgharian, Martin A. Lindquist

Sequential stratified inference for the mean

We develop conservative tests for the mean of a bounded population under stratified sampling and apply them to risk-limiting post-election audits. The tests are "anytime valid" under sequential sampling, allowing optional stopping in each…

Statistical Methodology · Statistics 2026-03-17 Jacob V. Spertus, Mayuri Sridhar, Philip B. Stark

Testing common structure in high-dimensional factor models: change-point and two-sample procedures

This work proposes a novel procedure to test for common structures across two high-dimensional factor models. The proposed test makes it possible to uncover whether two factor models are driven by the same loading matrix up to some linear…

Statistical Methodology · Statistics 2026-03-17 Marie-Christine Düker, Vladas Pipiras

Combining Evidence Across Filtrations

In sequential anytime-valid inference, any admissible procedure must be based on e-processes: generalizations of test martingales that quantify the accumulated evidence against a composite null hypothesis at any stopping time. This paper…

Statistical Methodology · Statistics 2026-03-17 Yo Joong Choe, Aaditya Ramdas

Assessing Influential Observations in Pain Prediction using fMRI Data

Neuroimaging data allows researchers to model the relationship between multivariate patterns of brain activity and outcomes related to mental states and behaviors. However, the existence of outlying participants can potentially undermine…

Statistical Methodology · Statistics 2026-03-17 Dongliang Zhang, Masoud Asgharian, Martin A. Lindquist

Small Area Estimation using EBLUPs under the Nested Error Regression Model

Estimating characteristics of domains (referred to as small areas) within a population from sample surveys of the population is an important problem in survey statistics. In this paper, we consider model-based small area estimation under…

Statistical Methodology · Statistics 2026-03-17 Ziyang Lyu, A. H. Welsh

When Your Model Stops Working: Anytime-Valid Calibration Monitoring

Practitioners monitoring deployed probabilistic models face a fundamental trap: any fixed-sample test applied repeatedly over an unbounded stream will eventually raise a false alarm, even when the model remains perfectly stable. Existing…

Statistical Methodology · Statistics 2026-03-16 Tristan Farran

TwoTimeScales: An R-package for Smoothing Hazards with Two Time Scales

Background: Time-to-event data with multiple time scales are observed in many epidemiological and clinical studies. While models that allow for simultaneous consideration of multiple time scales for the hazard of an event have been…

Statistical Methodology · Statistics 2026-03-16 Angela Carollo, Paul H. C. Eilers, Hein Putter, Jutta Gampe

Breaking the Winner's Curse with Bayesian Hybrid Shrinkage

The widespread adoption of randomized controlled trials (A/B tests) for decision-making has introduced a pervasive "Winner's Curse": experiments selected for launch often exhibit upwardly biased effect estimates and invalid confidence…

Statistical Methodology · Statistics 2026-03-16 Richard Mudd, Abbas Zaidi, Rina Friedberg, Ilya Gorbachev, Anchal Choubey, Houssam Nassif