Applied Statistics
NBA team managers and owners try to acquire high-performing players. An important consideration in these decisions is how well the new players will perform in combination with their teammates. Our objective is to identify elite five-person…
The discovery of disease subtypes is an essential step for developing precision medicine, and disease subtyping via omics data has become a popular approach. While promising, subtypes obtained from conventional approaches may not be…
The drug overdose crisis in the United States continues to intensify. Fatalities have increased five-fold since 1999, reaching a record high of 108,000 deaths in 2021. The epidemic has unfolded through distinct waves of different drug types,…
In this study, we delve into the dynamics of Wordle using data analysis and machine learning. Our analysis initially focused on the correlation between the date and the number of submitted results. Due to initial popularity bias, we modeled…
We evaluated the sensitivity of estimated PM2.5 and NO2 health impacts to varying key input parameters and assumptions including: 1) the spatial scale at which impacts are estimated, 2) using either a single concentration-response function…
Flooding is one of the most disruptive and costliest climate-related disasters and presents an escalating threat to population health due to climate change and urbanization patterns. Previous studies have investigated the consequences of…
Quarantine is a commonly used non-pharmaceutical intervention during outbreaks of infectious diseases. A key problem in implementing quarantine is determining its duration. In this paper, a policy with…
The recently published ICH E9 addendum on estimands in clinical trials provides a framework for precisely defining the treatment effect that is to be estimated, but says little about estimation methods. Here we report analyses of a clinical…
Social science often relies on surveys of households and individuals. Dozens of such surveys are regularly administered by the U.S. government. However, they field independent, unconnected samples with specialized questions, limiting…
Drought is a global threat caused by the persistent challenges of climate change. It is important to identify drought conditions based on weather variables and their patterns. In this study, we enhanced the Standardized Precipitation…
In this study, we investigate the heat load patterns in one building using a multi-step forecasting model. We combine the Autoregressive models that use multiple eXogenous variables (ARX) with Seasonally adaptable Time of Week and Climate…
Speeding has been acknowledged as a critical determinant in increasing the risk of crashes and their resulting injury severities. This paper demonstrates that severe speeding-related crashes within the state of Pennsylvania have a spatial…
We introduce a three-step framework to determine at which pitches Major League batters should swing. Unlike traditional plate discipline metrics, which implicitly assume that all batters should always swing at (resp. take) pitches inside…
In this work, the goal is to estimate the abundance of an animal population using data coming from capture-recapture surveys. We leverage the prior knowledge about the population's structure to specify a parsimonious finite mixture model…
As a baseball game progresses, batters appear to perform better the more times they face a particular pitcher. The apparent drop-off in pitcher performance from one time through the order to the next, known as the Time Through the Order…
Traditional NBA player evaluation metrics are based on scoring differential or some pace-adjusted linear combination of box score statistics like points, rebounds, assists, etc. These measures treat performances with the outcome of the game…
This study explores the concept of prominence as a candidate trait, understood as the perceived worthiness of attention candidates elicit from regular citizens in the context of low information elections. It proposes two dimensions of…
The digital health industry has grown in popularity since the 2010s, but there has been limited analysis of the topics discussed in the field across academic disciplines. This study aims to analyze the research trends of digital…
Initial steps in statistical downscaling involve being able to compare observed data from regional climate models (RCMs). This prediction requires (1) regridding RCM output from their native grids and at differing spatial resolutions to a…
Most general population web surveys are based on online panels maintained by commercial survey agencies. However, survey agencies differ in their panel selection and management strategies. Little is known about whether these different strategies cause…