Statistical Methodology - Scifaro

CavMerge: Merging K-means Based on Local Log-Concavity

K-means clustering, a classic and widely used clustering technique, is known to exhibit suboptimal performance when applied to non-linearly separable data. Numerous adjustments and modifications have been proposed to address this issue,…

Statistical Methodology · Statistics 2026-04-07 Zhili Qiao, Wangqian Ju, Peng Liu

Simulated Annealing for Model-Robust Partial Profile Choice Designs in Healthcare Preference Studies

Discrete Choice Experiments (DCEs) investigate participants' preferences by observing their choice behavior in hypothetical scenarios and are widely used in the domain of healthcare. To reduce participants' cognitive burden, especially when…

Statistical Methodology · Statistics 2026-04-07 Yicheng Mao, Roselinde Kessels

Efficient estimation of relative risk, odds ratio and their logarithms for rare events

Sequential estimators are proposed for the relative risk, odds ratio, log relative risk or log odds ratio of a dichotomous attribute in two populations. The estimators take the same number of observations from each population, and guarantee…

Statistical Methodology · Statistics 2026-04-07 Luis Mendo

Variance Reduction Methods for Dirichlet Expectations

Dirichlet distributions are probability measures on the unit simplex. They are often used as prior distributions in modeling categorical data, such as in topic analysis of text data. Motivated by this application, we consider Monte Carlo…

Statistical Methodology · Statistics 2026-04-07 Ayeong Lee

Bootstrap-Aggregated Method-of-Moments Estimation of the Copula Correlation Parameter for Marginal Survival Inference under Dependent Censoring

In dependently censored survival data, the usual assumption of independent censoring or an incorrect specification of the correlation between the event and censoring times can bias marginal survival inference. Likelihood-based estimation of…

Statistical Methodology · Statistics 2026-04-07 Hyun-Soo Zhang, Inkyung Jung, Chung Mo Nam

Learning association from multiple intermediate events for dynamic prediction of survival: an application to cardiovascular disease prognosis

Cardiovascular diseases are major causes of mortality globally. They often co-occur and are interrelated, leading to partial-order relationships among their onset times. However, these onset times are subject to informative censoring due to…

Statistical Methodology · Statistics 2026-04-07 Tonghui Yu, Liming Xiang

Fused Multinomial Logistic Regression Utilizing Summary-Level External Machine-learning Information

In many modern applications, a carefully designed primary study provides individual-level data for interpretable modeling, while summary-level external information is available through black-box, efficient, and nonparametric…

Statistical Methodology · Statistics 2026-04-07 Chi-Shian Dai, Jun Shao

Estimation of treatment effect in clinical trials of continuous endpoints with retrieved dropouts

The estimand framework provides guidance on handling intercurrent events, such as treatment discontinuation, in the analysis of clinical trial responses. Under ICH E9(R1), the treatment policy (TP) strategy incorporates post-discontinuation…

Statistical Methodology · Statistics 2026-04-07 Myeongjong Kang, Sangyoon Yi

New insights into Elo algorithm for practitioners and statisticians

This work reconciles two perspectives on the Elo ranking that coexist in the literature: the practitioner's view as a heuristic feedback rule, and the statistician's view as online maximum likelihood estimation via stochastic gradient…

Statistical Methodology · Statistics 2026-04-07 Leszek Szczecinski

Confidence Intervals for Rate Estimation with Importance Sampling in Autonomous Vehicle Evaluation

Accounting for both rare events and complex sampling presents challenges when quantifying uncertainty for rate estimation in autonomous vehicle performance evaluation. In this paper, we introduce a statistical formulation of this problem…

Statistical Methodology · Statistics 2026-04-07 Aiyou Chen, Ruixuan Rachel Zhou, Joseph J. Lee, Nicholas Chamandy, Henning Hohnhold

A test for normality based on self-similarity

Testing for normality is a widely used procedure in statistics and data analysis, often applied prior to employing methods that rely on the assumption of normally distributed data. While several existing tests target distributional…

Statistical Methodology · Statistics 2026-04-07 Akin Anarat, Holger Schwender

Spherically Embedded Time Series with Unknown Trend and Periodic Components

Spherically embedded time series are time series whose values naturally reside on, or can be equivalently mapped to, the sphere. Despite their ubiquity in diverse scientific fields, these data frequently exhibit complex non-stationarity…

Statistical Methodology · Statistics 2026-04-07 Jiazhen Xu, Han Lin Shang

Multilevel Regression Discontinuity Models with Latent Variables

Regression discontinuity (RD) analysis with latent variables, as introduced by Morell et al. (2025), offers a useful augmentation of conventional RD by incorporating a measurement model. This approach is particularly relevant in education…

Statistical Methodology · Statistics 2026-04-07 Monica Morell, Youngjin Han, Muwon Kwon, Youjin Sung, Yang Liu, Ji Seung Yang

Robust Standard Errors for Bayesian Posterior Functionals via the Infinitesimal Jackknife

Quantitative research in the social and behavioral sciences relies heavily on nonlinear posterior functionals such as indirect effects, standardized coefficients, effect sizes, intraclass correlations, and multilevel variance-explained…

Statistical Methodology · Statistics 2026-04-07 Nanyu Luo, Feng Ji

Two Sample Test for Eigendecompositions of Functional Data

Neuron-level firing data is believed to be governed by latent activation patterns during task completion. Analysing repeated trials of a task allows us to study these patterns, typically by averaging in-vivo neural spikes across trials.…

Statistical Methodology · Statistics 2026-04-07 Angel Garcia de la Garza, Britton Sauerbrei, Jeff Goldsmith

Making Effective Statistical Inferences: From Significance Testing to the Open Science Inference Ecosystem (2016-2026)

Statistical inference has undergone a profound transformation over the past decade, evolving from a significance-testing paradigm toward a comprehensive, transparency-driven framework embedded within the broader open science ecosystem.…

Statistical Methodology · Statistics 2026-04-07 Aswini Kumar Patra

A Novel Three-Parameter Extended Weibull Distribution for Health Data Modelling

The Weibull distribution is widely used in modelling health data. However, its lack of sufficient tail flexibility often results in a poor fit for extreme events. We propose another three-parameter extension of the Weibull distribution with…

Statistical Methodology · Statistics 2026-04-07 Isqeel Ogunsola, Nurudeen Ajadi, Gboyega Adepoju

Power Analysis is Essential: High-Powered Tests Suggest Minimal to No Effect of Rounded Shapes on Click-Through Rates

Underpowered studies (below 50% power) suffer from the winner's curse: A statistically significant positive estimate must exaggerate the true treatment effect to meet the significance threshold. A study by Dipayan Biswas, Annika Abell, and…

Statistical Methodology · Statistics 2026-04-07 Ron Kohavi, Jakub Linowski, Lukas Vermeer, Fabrice Boisseranc, Joachim Furuseth, Andrew Gelman, Guido Imbens, Ravikiran Rajagopal

A fine-grained look at causal effects in causal spaces

The notion of causal effect is fundamental across many scientific disciplines. Traditionally, quantitative researchers have studied causal effects at the level of variables; for example, how a certain drug dose (W) causally affects a…

Statistical Methodology · Statistics 2026-04-07 Junhyung Park, Yuqing Zhou

A Bayes-Motivated Quadratic-Form Test for High-Dimensional Mean Testing

We propose a two-sample mean test based on the Bayes factor with non-informative priors, specifically designed for scenarios where the dimension $p$ grows with the sample size $n$ at a linear rate, $p/n \to c_1 \in (0, \infty)$. We…

Statistical Methodology · Statistics 2026-04-07 Daojiang He, Suren Xu, Jing Zhou