Statistical Methodology - Scifaro

Evaluating HWE and Association in Genome Wide Association Studies: A Unified Procedure

In genome wide association studies (GWASs) based on a case-control design, single nucleotide polymorphisms (SNPs) are typically evaluated for an association test and a Hardy-Weinberg equilibrium (HWE) goodness-of-fit test. SNPs are then…

Statistical Methodology · Statistics 2026-06-29 Stefan Böhringer, Hajo Holzmann

Beyond Equidistant Assumptions: An Autoregressive Ordered Stereotype Model for Ordinal Time Series

We propose an extension of the ordered stereotype model (OSM) for ordinal time series data, referred to as the Autoregressive OSM (AR-OSM). The model captures serial dependence by incorporating lagged values of the response as covariates in…

Statistical Methodology · Statistics 2026-06-29 Anna Nalpantidi, Dimitris Karlis, Daniel Fernández

Scalable coarse-to-fine spatial downscaling

This study proposes coarse-to-fine downscaling (CF-DS), a scalable spatial downscaling method extending coarse-to-fine spatial modeling. Unlike conventional spatial-statistical downscaling methods such as area-to-point kriging, CF-DS does…

Statistical Methodology · Statistics 2026-06-29 Daisuke Murakami, Yongwan Chun, Takahiro Yoshida, Hajime Seya

HERO: Improving the Reliability and Sensitivity of Generative Model Evaluation Using Historical Data

Reliable generative AI models critically rely on expert human annotations to evaluate output quality, yet these "gold" labels are expensive to collect and limited in quantity. Organizations thus often turn to collecting vast but noisy…

Statistical Methodology · Statistics 2026-06-29 Xinrui Ruan, Zhenyu Zhao, Waverly Wei, Yueshan Zhang, Zeyu Zheng, Sui Huang, Jingshen Wang

Testing hypotheses via orthogonalization

Classical hypothesis testing frameworks break down in contemporary settings in which null hypotheses are increasingly abstract, the same data are used to both generate and test hypotheses, and minimal assumptions about the underlying data…

Statistical Methodology · Statistics 2026-06-29 Ameer Dharamshi, Runjia Zou, Daniela Witten

Multi-Source Transfer Learning of Sparse Single-Index Models

Transfer learning leverages knowledge from related source domains to improve learning in a target domain. Recent theoretical advances cover a broad range of regression settings within (generalized) linear models. Despite their diversity,…

Statistical Methodology · Statistics 2026-06-28 Ye Tian

Beyond Local Independence: High-Dimensional Latent Class Graphical Models with Shared Block Structure

Latent class models are central tools for multivariate categorical data from heterogeneous populations, but their standard local-independence assumption is often unrealistic in modern high-dimensional applications. We propose a…

Statistical Methodology · Statistics 2026-06-28 Seunghyun Lee, Yuqi Gu

Modelling and detecting mild and gross anomalies in circular data via double-contaminated models

In this paper, we propose a model-based framework to robustify inference for circular data in the presence of anomalous observations, distinguishing between mild and gross anomalies. Starting from a unimodal and symmetric reference model on…

Statistical Methodology · Statistics 2026-06-28 Antonio Punzo, Andriëtte Bekker, Arno Otto, Priyanka Nagar, Cristina Tortora

Scalable Bayesian Spatial Mixture Modelling for Remote Sensing Image Segmentation

Accurate and scalable land cover classification is essential for global conservation monitoring and policy-making. While remote sensing images provide a cost-effective alternative to ground surveys, current methods often lack principled…

Statistical Methodology · Statistics 2026-06-28 Bao Khanh Nguyen, Iain Cameron, Cecilia Balocchi, Torben Sell

Multivariate Varying-Coefficient BART with Graphical Horseshoe Priors

Modern multivariate regression problems involve several related outcomes whose regression effects are not only nonlinear, heterogeneous, and outcome-specific, but also where the residual dependence among outcomes is scientifically…

Statistical Methodology · Statistics 2026-06-27 Soham Ghosh, Sameer K. Deshpande

Panel Flow Matching: A Generative Approach to Learning Distributions of Longitudinal Data

Learning distributions of longitudinal data is central to tasks such as visualization, completion, classification, and synthetic data generation, but it remains statistically challenging because longitudinal observations are often…

Statistical Methodology · Statistics 2026-06-27 Jianbin Tan, Pixu Shi, Anru R. Zhang

Learning heterogeneous treatment effects under principal stratification

Principal stratification provides a foundational framework for causal inference with intermediate outcomes by defining causal effects within subpopulations, yet existing work has largely focused on average effects across strata rather than…

Statistical Methodology · Statistics 2026-06-27 Jiaqi Tong, Fan Li

On Modeling Cylindrical Data with a Discrete Circular Component and Its Environmental Applications

Standard statistical methods are often inadequate for modeling the joint dependence between linear and circular variables, and existing methods for modeling this dependence are designed only for continuous variables. However, circular data…

Statistical Methodology · Statistics 2026-06-27 Brajesh Kumar Dhakad, Jayant Jha

Beta-trees for testing multivariate goodness-of-fit and localizing deviations from a model

We introduce a novel goodness-of-fit (GOF) procedure based on Beta-tree partitions. A Beta-tree produces a data-adaptive partition of the sample space into regions and provides guaranteed finite sample confidence intervals for the…

Statistical Methodology · Statistics 2026-06-27 Valerie N. P. Ho, Guenther Walther

Generated outcomes as generated regressors: Equivalences in recursive causal estimation

Time-varying treatment effects, surrogate-identified treatment effects, and mediation effects can all be written as recursive regressions, in which each regression's predicted values become generated outcomes for the next regression. We…

Statistical Methodology · Statistics 2026-06-27 Wisse Rutgers, Rahul Singh

Measurement Induced Confounding

A critical assumption of observational studies is that all confounding variables must be known and sufficiently adjusted for to estimate causal effects. An implicit, and often overlooked, aspect of this assumption is that all confounding…

Statistical Methodology · Statistics 2026-06-27 George Perrett, Klint Kanopka

Inferring Comprehensive Cohort Causal Effects in the Presence of Unmeasured Confounding and Missing Outcomes

This paper presents a methodological framework for estimating the comprehensive cohort causal effect (CCCE) in mixed-design clinical studies that combine randomized controlled trials (RCTs) and parallel observational study (OBS). Our…

Statistical Methodology · Statistics 2026-06-27 Shiyao Xu, Razieh Nabi, Martin Underwood, Daniel Scharfstein

Composition as Direction: An Active-Set Ray-Based Model for Sparse High-Dimensional Compositional Data

[Working Draft] Compositional data are central to microbial, ecological, and environmental research, yet often have four features that are difficult to accommodate jointly: exact zeros, latent dependence among components,…

Statistical Methodology · Statistics 2026-06-27 Michael R Schwob, Jyotishka Datta

Inverse Probability Weighting in a Post-Bayesian World

We present a justification of the use of Inverse Probability Weighting (IPW) in a post-Bayesian framework, in which the bias-correction provided by IPW in a frequentist context is reframed as a reweighting of the Kullback-Leibler (KL)…

Statistical Methodology · Statistics 2026-06-27 Owen Thomas, William Denault, Valeria Vitelli

A bootstrap approach to prediction-powered inference

Prediction-powered inference (PPI) refers to a two-level situation where the statistician observes a set of $(x,y)$ pairs and another set of $x$s with the responses $y$ missing. Also available is some independent background data from which…

Statistical Methodology · Statistics 2026-06-26 Bradley Efron