Statistical Methodology - Scifaro

Analysis of Linked Files: A Missing Data Perspective

In many applications, researchers seek to identify overlapping entities across multiple data files. Record linkage algorithms facilitate this task, in the absence of unique identifiers. As these algorithms rely on semi-identifying…

Statistical Methodology · Statistics 2026-04-24 Gauri Kamat, Roee Gutman

Estimating Fold Changes from Partially Observed Outcomes with Applications in Microbial Metagenomics

We consider the problem of estimating fold-changes in the expected value of a multivariate outcome observed with unknown sample-specific and category-specific perturbations. This challenge arises in high-throughput sequencing studies of the…

Statistical Methodology · Statistics 2026-04-24 David S Clausen, Sarah Teichman, Amy D Willis

ProfileGLMM: a R Package Extending Bayesian Profile Regression using Generalised Linear Mixed Models

ProfileGLMM is an R package integrating Generalised Linear Mixed Models (GLMMs) as the outcome model for Bayesian profile regression. This statistical framework simultaneously i) explains the variation in the outcome and ii) clusters the…

Statistical Methodology · Statistics 2026-04-23 Matteo Amestoy, Mark A. van de Wiel, Wessel N. van Wieringen

Data Integration for Estimating Subgroup-Specific Conditional Average Treatment Effects (CATEs) Using Coarsened External Information in Randomized Trials

Randomized controlled trials (RCTs) are often underpowered to detect treatment heterogeneity in subgroups defined by cross-classifications of multiple covariates, due to sparse sample sizes in some strata. External RCT data can help, but…

Statistical Methodology · Statistics 2026-04-23 Youqi Yang, Walter Dempsey, Bhramar Mukherjee

Double Robust Weighted Regression with Missing Confounders

Missing confounders are common in observational studies and present fundamental challenges for causal effect estimation by weakening identification and increasing sensitivity to model misspecification. Within the missing-indicator…

Statistical Methodology · Statistics 2026-04-23 Md. Shaddam Hossain Bagmar, Hua Shen

Dynamic Prediction of the Target Survival Time in Metastatic Solid Tumor Cancer Clinical Trials

Overall survival (OS) is the gold standard for assessing patient benefit and cost-effectiveness of new cancer drugs. However, it is often difficult to use OS as the primary endpoint in randomized clinical trials (RCTs) for patients with…

Statistical Methodology · Statistics 2026-04-23 Sidi Wang, Kelley Kidwell, Bo Huang, Satrajit Roychoudhury

Zero-Inflated Logistic Regression Models with Shared Design: Identifiability, Existence of Estimates, and a Relabeling Rule

The zero-inflated logistic regression model accommodates binary responses with excess zeros, which often arise from a latent mixture of susceptible and insusceptible subpopulations or asymmetric misclassification of the response. The model…

Statistical Methodology · Statistics 2026-04-23 Yui Tomo, Shinto Eguchi, Daisuke Yoneoka

A general nonparametric framework for testing hypotheses about function-valued parameters

We present a general nonparametric approach for testing whether a statistical parameter defined through conditional distributions is constant across the conditioning variables. Such hypotheses arise naturally in problems such as assessing…

Statistical Methodology · Statistics 2026-04-23 Albert Osom, Ali Shojaie, Aaron Hudson

Weighted Holm Procedures: Theory, Properties, and Recommendations

In many statistical applications, particularly in clinical studies, hypotheses may carry different levels of importance, motivating the use of weighted multiple testing procedures (wMTPs) to control the familywise error rate (FWER). Among…

Statistical Methodology · Statistics 2026-04-23 Beibei Li, Wenge Guo

Meta-analysis of networks of diagnostic tests with binary and continuous results

Network meta-analysis of diagnostic test accuracy (NMA-DTA) is a relatively new field, involving combining evidence across studies to evaluate and compare the accuracy of different tests for a given condition. However, the methods proposed…

Statistical Methodology · Statistics 2026-04-23 Efthymia Derezea, Gabriel Rogers, Nicky J Welton, Hayley E Jones

Constructing external comparator groups via transportability in mean or in effect measure

Learning about causal effects in target populations and their subsets may be facilitated by combining information from multiple sources. One major class of study designs that combine information involves appending an index study with data…

Statistical Methodology · Statistics 2026-04-23 Lawson Ung, Guanbo Wang, Sebastien Haneuse, Sonia Hernandez-Diaz, Miguel A. Hernán, Issa J. Dahabreh

Principal Nested Cones

In many applications, the data lie on a type of cone, where there is a distinction between an overall scale variable and the remaining scale-free structure. For example, the joint size and shape of objects are points on a cone, where size…

Statistical Methodology · Statistics 2026-04-23 Yanyan Zhan, Ian L. Dryden, Yuexuan Wu

A Bayes-Factor-Guided Approach to Post-Double Selection with Bootstrapped Multiple Imputation

When variable selection methods are applied to bootstrapped and multiply imputed datasets, the set of selected variables typically varies across iterations. Aggregating results via the union rule can lead to overly dense models. We propose…

Statistical Methodology · Statistics 2026-04-23 Johannes Bleher, Claudia Tarantola

Spatial deformation in a Bayesian spatiotemporal model for incomplete matrix-variate responses

In this paper, we propose a Bayesian matrix-variate spatiotemporal modeling framework for jointly analyzing multiple response variables observed at spatial locations over time. The approach relaxes the standard assumption of spatial…

Statistical Methodology · Statistics 2026-04-23 Rodrigo de Souza Bulhões, Marina Silva Paez, Dani Gamerman

Scalable Bayesian inference for high-dimensional mixed-type multivariate spatial data

Spatial generalized linear mixed-effects models are popularly used to analyze spatially indexed univariate responses. However, with modern technology, it is common to observe vector-valued mixed-type responses, e.g., a combination of…

Statistical Methodology · Statistics 2026-04-23 Arghya Mukherjee, Arnab Hazra, Dootika Vats

Efficient Log-Rank Updates for Random Survival Forests

Random survival forests are widely used for estimating covariate-conditional survival functions under right-censoring. Their standard log-rank splitting criterion is typically recomputed at each candidate split. This O(M) cost per split,…

Statistical Methodology · Statistics 2026-04-23 Erik Sverdrup, James Yang, Michael LeBlanc

Two-sample comparison through additive tree models for density ratios

The ratio of two densities provides a direct characterization of their differences. We consider the two-sample comparison problem by estimating this ratio given i.i.d. observations from two distributions. To this end, we propose additive…

Statistical Methodology · Statistics 2026-04-23 Naoki Awaya, Yuliang Xu, Li Ma

Adaptive Multi-task Learning for Multi-sector Portfolio Optimization

Accurate transfer of information across multiple sectors to enhance model estimation is both significant and challenging in multi-sector portfolio optimization involving a large number of assets in different classes. Within the framework of…

Statistical Methodology · Statistics 2026-04-23 Qingliang Fan, Ruike Wu, Yanrong Yang

A Robust Nonparametric Framework for Detecting Repeated Spatial Patterns

Identifying spatially contiguous clusters and repeated spatial patterns (RSP) characterized by similar underlying distributions that are spatially apart is a key challenge in modern spatial statistics. Existing constrained clustering…

Statistical Methodology · Statistics 2026-04-23 Rajitha Senanayake, Pratheepa Jeganathan

Anytime-valid simultaneous lower confidence bounds for the true discovery proportion

We propose a method that combines the closed testing framework with the concept of safe anytime-valid inference (SAVI) to compute lower confidence bounds for the true discovery proportion in a multiple testing setting. The proposed…

Statistical Methodology · Statistics 2026-04-23 Friederike Preusse