Statistical Methodology - Scifaro

On the Conservativeness of Robust Variance Estimators in Propensity Score Weighted Cox Models

In propensity score weighted analysis, robust variance that does not account for weight estimation is commonly used. In propensity score weighted Cox models (CoxPSW), the robust variance is known to be conservative when weights for the…

Statistical Methodology · Statistics 2026-04-17 Hiroya Morita, Shunichiro Orihara, Fumitaka Shimizu, Masataka Taguri

Adaptive Multi-Prior Lasso for High-Dimensional Generalized Linear Models

Incorporation of external information into high-dimensional modeling for gene expression data has been shown, both theoretically and empirically, to substantially enhance performance. Such external information, sometimes referred to as…

Statistical Methodology · Statistics 2026-04-17 Fuzhi Xu, Weijuan Liang, Shuangge Ma, Qingzhao Zhang

Ranked-choice conjoint experiments

Forced-choice conjoint designs have become a staple method in the experimentalist's toolkit. However, the forced-choice outcome is neither always consistent with the types of choices individuals make in real political contexts, nor is it…

Statistical Methodology · Statistics 2026-04-17 Thomas S. Robinson, Mats Ahrenshop, Spyros Kosmidis

Model Checking for Regressions Based on Weighted Residual Processes with Diverging Number of Predictors

The integrated conditional moment (ICM) test is a classical and widely used method for assessing the adequacy of regression models. Although it performs well in fixed-dimension settings, its behavior changes dramatically when the predictor…

Statistical Methodology · Statistics 2026-04-17 Yue Hu, Haiqi Li, Xintao Xia

Bayesian sparse principal coordinates analysis with delta-tolerant linear approximation for microbiome data

Principal coordinates analysis (PCoA) is a standard exploratory tool for microbiome beta-diversity studies, but its axes are defined by pairwise dissimilarities and therefore do not directly identify the taxa driving an ordination. We…

Statistical Methodology · Statistics 2026-04-17 Hsin-Hsiung Huang, Ruitao Liu, Liangliang Zhang, Shao-Hsuan Wang

Bayesian Node-Level Outlier Detection for Graph Signals

This paper proposes a fully Bayesian framework for node-level outlier detection in graph signals, where measurements are observed on the nodes of an underlying graph. Unlike traditional outlier detection methods, our approach accounts for…

Statistical Methodology · Statistics 2026-04-17 Seongmin Kim, Kyusoon Kim

Propensity Score Weighting to Ensure Balance in Key Subgroups or Strata: A Practical Guide

Propensity score weighting approaches have been widely implemented in clinical research to estimate the effects of a treatment or exposure while mitigating the risk of confounding in the absence of random assignment. In practice, when…

Statistical Methodology · Statistics 2026-04-17 Emma K. Mackay, Amol A. Verma, Fahad Razak, Surain B. Roberts

Deployment of AI-Assisted Interventions: Capacity Constraints and Noisy Compliance

AI tools increasingly guide targeted interventions in healthcare, education, and recruiting. Algorithms score individuals, trigger outreach to those above a threshold (e.g., high-risk or high-value), and encourage them to request service;…

Statistical Methodology · Statistics 2026-04-17 Carri W. Chan, Yi Han, Hannah Li, Benjamin L. Ranard

PROXIMA: A Reliability Scoring Framework for Proxy Metrics in Online Controlled Experiments

Online A/B testing at scale relies on proxy metrics -- short-term, easily-measured signals used in place of slow-moving long-term outcomes. When the proxy-outcome relationship is heterogeneous across user segments, aggregate correlation can…

Statistical Methodology · Statistics 2026-04-17 Avinash Amudala

Combining Bayesian and Frequentist Inference for Laboratory-Specific Performance Guarantees in Copy Number Variation Detection

Targeted amplicon panels are widely used in oncology diagnostics, but providing per-gene performance guarantees for copy number variant (CNV) detection remains challenging due to amplification artifacts, process-mismatch heterogeneity, and…

Statistical Methodology · Statistics 2026-04-17 Austin Talbot, Alex V. Kotlar, Yue Ke

Cellwise Outliers

In statistics and machine learning, the traditional meaning of the terms 'outlier' and 'anomaly' is a case in the dataset that behaves differently from the bulk of the data. This raises suspicion that it may belong to a different…

Statistical Methodology · Statistics 2026-04-17 Mia Hubert, Jakob Raymaekers, Peter J. Rousseeuw

Prior Smoothing for Multivariate Disease Mapping Models

To date, we have seen the emergence of a large literature on multivariate disease mapping. That is, incidence of (or mortality from) multiple diseases is recorded at the scale of areal units where incidence (mortality) across the diseases…

Statistical Methodology · Statistics 2026-04-17 Garazi Retegui, María Dolores Ugarte, Jaione Etxeberria, Alan E. Gelfand

Propensity Score Propagation: A General Framework for Design-Based Inference with Unknown Propensity Scores

Design-based inference, also known as randomization-based or finite-population inference, provides a principled framework for trustworthy statistical inference by attributing randomness solely to the design mechanism (e.g., treatment…

Statistical Methodology · Statistics 2026-04-17 Siyu Heng, Yanxin Shen, Zijian Guo

Model-Free Assessment of Simulator Fidelity via Quantile Curves

As generative AI models are increasingly used to simulate real-world systems, quantifying the "sim-to-real" gap is critical. For each input setting of interest -- which we call a scenario, such as a survey question or operating…

Statistical Methodology · Statistics 2026-04-17 Garud Iyengar, Yu-Shiou Willy Lin, Kaizheng Wang

Causal Decomposition Analysis with Synergistic Interventions: A Triply-Robust Machine Learning Approach to Addressing Multiple Dimensions of Social Disparities

Educational disparities are rooted in and perpetuate social inequalities across multiple dimensions such as race, socioeconomic status, and geography. To reduce disparities, most intervention strategies focus on a single domain and…

Statistical Methodology · Statistics 2026-04-17 Soojin Park, Su Yeon Kim, Xinyao Zheng, Chioun Lee

Measuring multi-calibration

A suitable scalar metric can help measure multi-calibration, defined as follows. When the expected values of observed responses are equal to corresponding predicted probabilities, the probabilistic predictions are known as "perfectly…

Statistical Methodology · Statistics 2026-04-17 Ido Guy, Daniel Haimovich, Fridolin Linder, Nastaran Okati, Lorenzo Perini, Niek Tax, Mark Tygert

Enhancing Inference for Small Cohorts via Transfer Learning and Weighted Integration of Multiple Datasets

Lung sepsis remains a significant concern in the Northeastern U.S., yet the national eICU Collaborative Database includes only a small number of patients from this region, highlighting underrepresentation. Understanding clinical variables…

Statistical Methodology · Statistics 2026-04-17 Subharup Guha, Mengqi Xu, Yi Li

The Promises of Multiple Experiments: Identifying Joint Distribution of Potential Outcomes

Typical causal effects are defined based on the marginal distribution of potential outcomes. However, many real-world applications require causal estimands involving the joint distribution of potential outcomes to enable more nuanced…

Statistical Methodology · Statistics 2026-04-17 Peng Wu, Xiaojie Mao

Association measures for two-way contingency tables based on multi-categorical proportional reduction in error

In two-way contingency tables under an asymmetric situation, where the row and column variables are defined as explanatory and response variables, respectively, quantifying the extent to which the explanatory variable contributes to…

Statistical Methodology · Statistics 2026-04-17 Wataru Urasaki, Kouji Tahata, Sadao Tomizawa

Improving Treatment Effect Estimation in Trials through Adaptive Borrowing of External Controls

Randomized controlled trials (RCTs) often suffer from limited inferential efficiency in estimating treatment effects due to their small sample sizes. In recent years, incorporating external controls (ECs) has gained increasing attention as…

Statistical Methodology · Statistics 2026-04-16 Qinwei Yang, Jingyi Li, Peng Wu, Shu Yang