Statistical Methodology - Scifaro

Identifying the potential of sample overlap in evidence synthesis of observational studies

Sample overlap is a common issue in evidence synthesis in the field of medical research, particularly when integrating findings from observational studies utilizing existing databases such as registries. Due to the general inaccessibility…

Statistical Methodology · Statistics 2026-02-26 Zhentian Zhang, Tim Friede, Tim Mathes

An index of effective number of variables for uncertainty and reliability analysis in model selection problems

An index of an effective number of variables (ENV) is introduced for model selection in nested models. This is the case, for instance, when we have to decide the order of a polynomial function or the number of bases in a nonlinear…

Statistical Methodology · Statistics 2026-02-26 Luca Martino, Eduardo Morgado, Roberto San Millán-Castillo

Evaluating time-varying treatment effects in hybrid SMART-MRT designs

Recently a new experimental approach, the hybrid experimental design (HED), was introduced to enable investigators to answer scientific questions about building behavioral interventions in which human-delivered and digital components are…

Statistical Methodology · Statistics 2026-02-26 Mengbing Li, Inbal Nahum-Shani, Walter Dempsey

Discussion of "Matrix Completion When Missing Is Not at Random and Its Applications in Causal Panel Data Models"

Choi and Yuan (2025) propose a novel approach to applying matrix completion to the problem of estimating causal effects in panel data. The key insight is that even in the presence of structured patterns of missing data, i.e., selection…

Statistical Methodology · Statistics 2026-02-26 Eli Ben-Michael, Avi Feller

Error-Controlled Borrowing from External Data Using Wasserstein Ambiguity Sets

Incorporating external data can improve the efficiency of clinical trials, but distributional mismatches between current and external populations threaten the validity of inference. While numerous dynamic borrowing methods exist, the…

Statistical Methodology · Statistics 2026-02-26 Yui Kimura, Shu Tamano

The generalized underlap coefficient with an application in clustering

Quantifying distributional separation across groups is fundamental in statistical learning and scientific discovery, yet most classical discrepancy measures are tailored to two-group comparisons. We generalize the underlap coefficient…

Statistical Methodology · Statistics 2026-02-26 Zhaoxi Zhang, Vanda Inacio, Sara Wade

Modi linear failure rate distribution with application to survival time data

A new lifetime model, named the Modi linear failure rate distribution, is suggested. This flexible model is capable of accommodating a wide range of hazard rate shapes, including decreasing, increasing, bathtub, upside-down bathtub, and…

Statistical Methodology · Statistics 2026-02-26 Lazhar Benkhelifa

Estimating the growth rate of a birth and death process using data from a small sample

The problem of estimating the growth rate of a birth and death process based on the coalescence times of a sample of $n$ individuals has been considered by several authors (\cite{stadler2009incomplete, williams2022life,…

Statistical Methodology · Statistics 2026-02-26 Carola Sophia Heinzel, Jason Schweinsberg

Goodness-of-fit test for multi-layer stochastic block models

Community detection in multi-layer networks is a fundamental task in complex network analysis across various areas like social, biological, and computer sciences. However, most existing algorithms assume that the number of communities is…

Statistical Methodology · Statistics 2026-02-26 Huan Qing

Semi-parametric bulk and tail regression using spline-based neural networks

Semi-parametric quantile regression (SPQR) is a flexible approach to density regression that learns a spline-based representation of conditional density functions using neural networks. As it makes no parametric assumptions about the…

Statistical Methodology · Statistics 2026-02-26 Reetam Majumder, Jordan Richards

A Dynamic Factor Model for Multivariate Counting Process Data

We propose a dynamic multiplicative factor model for process data, which arise from complex problem-solving items, an emerging testing mode in large-scale educational assessment. The proposed model can be viewed as an extension of the…

Statistical Methodology · Statistics 2026-02-26 Fangyi Chen, Hok Kan Ling, Zhiliang Ying

Grade of membership analysis for multi-layer ordinal categorical data

Consider a group of individuals (subjects) participating in the same psychological tests with numerous questions (items) at different times, where the choices of each item have an implicit ordering. The observed responses can be recorded in…

Statistical Methodology · Statistics 2026-02-26 Huan Qing

Infer-and-widen, or not?

In recent years, there has been substantial interest in the task of selective inference: inference on a parameter that is selected from the data. Many of the existing proposals fall into what we refer to as the infer-and-widen…

Statistical Methodology · Statistics 2026-02-26 Ronan Perry, Zichun Xu, Olivia McGough, Daniela Witten

Sparse outlier-robust PCA for multi-source data

Sparse and outlier-robust Principal Component Analysis (PCA) has been a very active field of research recently. Yet, most existing methods apply PCA to a single dataset, whereas multi-source data, i.e., multiple related datasets requiring…

Statistical Methodology · Statistics 2026-02-26 Patricia Puchhammer, Ines Wilms, Peter Filzmoser

Reconciling Overt Bias and Hidden Bias in Sensitivity Analysis for Matched Observational Studies

Matching is one of the most widely used causal inference designs in observational studies, but post-matching confounding bias remains a critical concern. This bias includes overt bias from inexact matching on measured confounders and hidden…

Statistical Methodology · Statistics 2026-02-26 Siyu Heng, Yanxin Shen, Pengyun Wang

Generalized Bayesian Multidimensional Scaling and Model Comparison

Multidimensional scaling (MDS) is widely used to reconstruct a low-dimensional representation of high-dimensional data while preserving pairwise distances. However, Bayesian MDS approaches based on Markov chain Monte Carlo (MCMC) face…

Statistical Methodology · Statistics 2026-02-26 Jiarui Zhang, Jiguo Cao, Liangliang Wang

A Time-Varying and Covariate-Dependent Correlation Model for Multivariate Longitudinal Studies

In multivariate longitudinal studies, associations between outcomes often exhibit time-varying and individual-level heterogeneity, motivating the modeling of correlations as an explicit function of time and covariates. However, most…

Statistical Methodology · Statistics 2026-02-25 Qingzhi Liu, Gen Li, Anastasia K. Yocum, Melvin McInnis, Brian D. Athey, Veerabhadran Baladandayuthapani

Robust and Sparse Generalized Linear Models for High-Dimensional Data via Maximum Mean Discrepancy

High-dimensional datasets are frequently subject to contamination by outliers and heavy-tailed noise, which can severely bias standard regularized estimators like the Lasso. While Maximum Mean Discrepancy (MMD) has recently been introduced…

Statistical Methodology · Statistics 2026-02-25 Xiaoning Kang, Lulu Kang

Empirically Calibrated Conditional Independence Tests

Conditional independence tests (CIT) are widely used for causal discovery and feature selection. Even with false discovery rate (FDR) control procedures, they often fail to provide frequentist guarantees in practice. We highlight two common…

Statistical Methodology · Statistics 2026-02-25 Milleno Pan, Antoine de Mathelin, Wesley Tansey

Exchangeable Gaussian Processes for Staggered-Adoption Policy Evaluation

We study the use of exchangeable multi-task Gaussian processes (GPs) for causal inference in panel data, applying the framework to two settings: one with a single treated unit subject to a once-and-for-all treatment and another with…

Statistical Methodology · Statistics 2026-02-25 Hayk Gevorgyan, Konstantinos Kalogeropoulos, Angelos Alexopoulos