Applied Statistics
The relationship between skin diseases and mental illnesses has been extensively studied using cross-sectional epidemiological data. Typically, such data can only measure association (rather than causation) and include only a subset of the…
Quantifying the complexity and irregularity of time series data is a primary pursuit across various data-scientific disciplines. Sample entropy (SampEn) is a widely adopted metric for this purpose, but its reliability is sensitive to the…
This paper underscores the vital role of the chi-square test within political science research utilizing structural equation modeling (SEM). The ongoing debate regarding the inclusion of chi-square test statistics alongside fit indices in…
The hot-hand theory posits that an athlete who has performed well in the recent past performs better in the present. We use multilevel logistic regression to test this theory for National Hockey League playoff goaltenders, controlling for a…
This paper investigates the projection of two major causes of cancer mortality, breast cancer and lung cancer, using a Bayesian modelling framework. We investigate patterns in 2001-2018 (as baseline) in cause-specific cancer mortality and…
Countries' research and development (R&D) plays a major role in the long-term development of the economy. We measure the R&D efficiency of all 28 member countries of the European Union in the years 2008--2014. Super-efficient data…
The categorization of retail products is essential for the business decision-making process. It is a common practice to classify products based on their quantitative and qualitative characteristics. In this paper we use a purely data-driven…
The United States national forest inventory (NFI) serves as the foundation for forest aboveground biomass (AGB) and carbon accounting across the nation. These data enable design-based estimates of forest carbon stocks and stock-changes at…
Examining the relationship between vulnerability of the built environment and community recovery is crucial for understanding disaster resilience. Yet, this relationship is rather neglected in the existing literature due to previous…
An alternative approach for the panel second stage of data envelopment analysis (DEA) is presented in this paper. Instead of efficiency scores, we propose to model rankings in the second stage using a dynamic ranking model in the…
We propose a modeling procedure for estimating immediate responses to TV ads and evaluating the factors influencing their size. First, we capture diurnal and seasonal patterns of website visits using the kernel smoothing method. Second, we…
This paper investigates the stochastic behavior of an n-node blockchain which is continuously monitored and faces non-stop cyber attacks from multiple hackers. The blockchain will start being re-set once hacking is detected, forfeiting…
Group testing, a method that screens subjects in pooled samples rather than individually, has been employed as a cost-effective strategy for chlamydia screening among Iowa residents. In efforts to deepen our understanding of chlamydia…
Digital experimentation and measurement (DEM) capabilities -- the knowledge and tools necessary to run experiments with digital products, services, or experiences and measure their impact -- are fast becoming part of the standard toolkit of…
We study the impact of teenage sports participation on early-adulthood health using longitudinal data from the National Study of Youth and Religion. We focus on two primary outcomes measured at ages 23--28 -- self-rated health and total…
Reducing carbon dioxide (CO2) emissions is vital at both global and national levels, given their significant role in exacerbating climate change. CO2 emissions, stemming from a variety of industrial and economic activities, are major…
This study presents an approach to analyze health disparities in Sexual and Gender Minority (SGM) populations, with a focus on the role of social support levels as an example to allow causal interpretations of regression models. We advocate…
This paper introduces a Bayesian inference framework for two-dimensional steady-state heat conduction, focusing on the estimation of unknown distributed heat sources in a thermally-conducting medium with uniform conductivity. The goal is to…
Time series data from real-world systems often display non-stationary behavior, indicating varying statistical characteristics over time. This inherent variability poses significant challenges in deciphering the underlying structural…
Objective: Hospitals register information in the electronic health records (EHR) continuously until discharge or death. As such, there is no censoring for in-hospital outcomes. We aimed to compare different dynamic regression modeling…