Statistics
The growing use of high-throughput sequencing (HTS) has enabled the large-scale production of compositional count data, driving progress in microbiome research. However, such count data are often high-dimensional, over-dispersed, and…
p-hacking occurs when researchers conduct multiple significance tests (e.g., p1;H0,1 and p2;H0,2) and then selectively report tests that yield desirable (usually significant) results (e.g., p2 < 0.05;H0,2) without correcting for multiple…
For many years I have taught an advanced statistical inference course for master's students using the text of Casella and Berger (2002). The book gives a comprehensive treatment of the core topics at a level that avoids measure theory while…
Through case studies, we demonstrate how multiverse analysis can strengthen the robustness and transparency of computational social science findings against alternative methodological decisions. We conduct multiverse analyses of three…
Hilbert's sixth problem calls for the axiomatization of physics, particularly the derivation of macroscopic statistical laws from microscopic mechanical principles. A conceptual difficulty arises in classical probability theory: in…
Suppose some cleverness score parameter is sufficiently interesting to be defined and then measured, perhaps for different strata of specialists or for the broader population. Such phenomena could have Gaussian distributions, when it comes…
Born in the late 20th century, R is one of the most popular software environments for statistical computing and graphics. With the development of information technology and the advent of the big data era, great changes have taken place in the R ecosystem. Based…
To solve the problem of detecting subspace signals in nonzero-mean clutter, we propose adaptive detectors based on the generalized likelihood ratio test (GLRT), Rao test, Wald test, gradient test, and Durbin test. The results…
Since its introduction by Fisher, the method of hypothesis testing that relies on computing error probabilities has witnessed several developments. Perhaps the most significant among them were the seminal contributions of Neyman and Pearson…
Express transportation network design is uncertain because origin-destination demand, travel time, operating cost, hub congestion, and realized sorting productivity vary over time. Existing multi-topology express network models usually…
This paper introduces the Statverse, a Metaverse framework designed to revolutionize statistical education in the digital age. Our key goal is to report our progress and encourage others to integrate similar strategies into their programs.…
This study develops a functional Liu-type shrinkage estimator (fLiu) for scalar-on-function regression in the presence of strong multicollinearity and high-dimensional functional predictors. The approach extends the classical Liu estimator…
Networked systems operating under intermittent adverse conditions and long memory can remain stable on average while exhibiting rare but extreme trajectory-level excursions. We study linear regime-switching network dynamics with…
Accurate fetal birth weight prediction is a cornerstone of prenatal care, yet traditional methods often rely on imaging technologies that remain inaccessible in resource-limited settings. This study presents a novel machine learning-based…
Large language models are thought to have the potential to aid in medical decision making. This work investigates the degree to which this might be the case. We start with the treatment problem, the patient's core medical decision-making…
Sudoku puzzles have a long history, with variations going back more than a hundred years, but their current and perhaps surprising worldwide prominence goes back to certain initiatives and then puzzle-generating computer programmes from…
This study investigates perceptions and use of generative artificial intelligence (GenAI) tools among students and faculty in statistics and data science at a historically Black college or university. Survey data from 119 valid student…
Discussion on "Regression by Composition" by Farewell, Daniel, Stensrud, and Huitfeldt.
Artificial intelligence (AI) systems increasingly shape how people access health information, make medical decisions, and receive care, yet epidemiology lacks frameworks for measuring AI exposure or studying its health effects at the…