Applied Statistics
We will study the impact of adolescent sports participation on early-adulthood health using longitudinal data from the National Study of Youth and Religion. We focus on two primary outcomes measured at ages 23 to 28: self-rated health and…
We investigate molecular gene expression studies and public databases for disease modelling using probabilistic graphical models and Bayesian inference. A case study on Spinal Muscular Atrophy genome-wide association study results is…
The COVID-19 pandemic caused major disruptions and contributed to the loss of livelihoods and income. The pandemic also prompted shifts in public health and health systems policy towards better promotion and protection in responding to such…
Prediction intervals (PIs) offer an effective tool for quantifying the uncertainty of loads in distribution systems. Traditional central PIs cannot adapt well to skewed distributions, and their offline training is vulnerable to…
Wind power forecasting is essential to power system operation and electricity markets. As abundant data became available thanks to the deployment of measurement infrastructures and the democratization of meteorological modelling, extensive…
We compute bias, variance, and approximate confidence intervals for the efficiency of a random selection process under various special conditions that occur in practical data analysis. We consider the following cases: a) the number of…
With the rising popularity of Wordle, people have eagerly taken to Twitter to report their results daily by the tens of thousands. In this paper, we develop a comprehensive model which uses this data to predict Wordle player performance and…
Background: The debate over daylight saving time has intensified, with growing interest in the effects of sunlight exposure on health. Prior studies simulated daylight saving time and standard time conditions by analyzing different locations…
We investigate estimation of causal effects of multiple competing (multi-valued) treatments in the absence of randomization. Our work is motivated by an intention-to-treat study of the relative cardiometabolic risk of assignment to one of…
In this paper we consider pricing of insurance contracts for breast cancer risk based on three multiple state models. Using population data in England and data from the medical literature, we calibrate a collection of semi-Markov and Markov…
The accurate prediction of patient prognosis is a critical challenge in clinical practice. With the availability of various patient information, physicians can optimize medical care by closely monitoring disease progression and therapy…
Emissions of nitric oxide and nitrogen dioxide, collectively referred to as NOx, are a major environmental and health concern. To react to the climate crisis, the South Korean government has strengthened NOx emission regulations. An accurate NOx…
In the realm of medical research, the intricate interplay of epidemiological risk, genomic activity, adverse events, and clinical response necessitates a nuanced consideration of multiple variables. Clinical trials, designed to meticulously…
The last two centuries have seen a significant increase in life expectancy. Although past trends suggest that mortality will continue to decline in the future, uncertainty and instability about this development have greatly increased due to…
Persistent homology is a methodology central to topological data analysis that extracts and summarizes the topological features within a dataset as a persistence diagram; it has recently gained much popularity from its myriad successful…
The COVID-19 pandemic was responsible for the cancellation of both the men's and women's 2020 National Collegiate Athletic Association (NCAA) Division 1 basketball tournaments. Starting from the point at which the Division 1 tournaments and…
In cities, the creation of public transport infrastructure such as light rail can cause changes at a very fine spatial scale, with different stories unfolding next to each other within the same urban neighborhood. We study the direct…
The present study investigates to what degree the common variance of the factor score predictor with the original factor, i.e., the determinacy coefficient or the validity of the factor score predictor, depends on the mean-difference…
Metric-based summary statistics such as mean and covariance have been introduced in neural spike train space. They can properly describe template and variability in spike train data, but are often sensitive to outliers and expensive to…
In the field of population health research, understanding the similarities between geographical areas and quantifying their shared effects on health outcomes is crucial. In this paper, we synthesise a number of existing methods to create a…