Applied Statistics
This study investigates monthly trends in New York City subway ridership throughout 2023 by integrating Metropolitan Transportation Authority (MTA) origin-destination data with weather data from Weather Underground. Using a longitudinal…
Multilevel regression and poststratification (MRP) is a computationally efficient indirect estimation method that can quickly produce improved population-adjusted estimates with limited data. Recent computational advancements allow…
Objective: Gun violence is a serious public health problem in the United States. The Gun Violence Archive (GVA) provides detailed geographic information, while the National Violent Death Reporting System (NVDRS) offers demographic,…
This paper introduces a novel framework for reducing variable selection bias by balancing selection frequencies of base-learners in boosting and introduces the sgboost package in R, which implements this framework combined with sparse-group…
The Susceptible-Infectious-Recovered (SIR) equations and their extensions comprise a commonly utilized set of models for understanding and predicting the course of an epidemic. In practice, it is of substantial interest to estimate the…
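For orientation, the basic SIR dynamics named in this abstract can be sketched as a simple forward-Euler simulation. This is only an illustration of the standard equations, not the estimation method of the paper; the function name `simulate_sir`, the parameter names `beta` (transmission rate) and `gamma` (recovery rate), and the step size are assumptions chosen for the example.

```python
def simulate_sir(beta, gamma, s0, i0, r0, dt=0.1, steps=1000):
    """Forward-Euler integration of the basic SIR equations.

    dS/dt = -beta * S * I / N
    dI/dt =  beta * S * I / N - gamma * I
    dR/dt =  gamma * I

    All names and defaults here are illustrative assumptions.
    """
    s, i, r = float(s0), float(i0), float(r0)
    n = s + i + r  # total population, constant in the basic model
    trajectory = [(s, i, r)]
    for _ in range(steps):
        new_infections = beta * s * i / n * dt
        new_recoveries = gamma * i * dt
        s -= new_infections
        i += new_infections - new_recoveries
        r += new_recoveries
        trajectory.append((s, i, r))
    return trajectory


if __name__ == "__main__":
    traj = simulate_sir(beta=0.3, gamma=0.1, s0=990, i0=10, r0=0)
    s_end, i_end, r_end = traj[-1]
    print(f"final state: S={s_end:.1f}, I={i_end:.1f}, R={r_end:.1f}")
```

Forward Euler is used only to keep the sketch dependency-free; a stiff or long-horizon problem would normally call for a proper ODE solver.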
Clustering multimorbidity has been a global research priority in recent years. Existing studies usually identify these clusters using one of several popular clustering methods and then explore various characteristics of these clusters,…
Procurement in maritime logistics faces challenges due to uncertainties in demand and fluctuating market conditions. To address these complexities, we introduce a flexible discrete-event simulation framework that models the request-to-order…
Multiple sclerosis is a chronic autoimmune disease that affects the central nervous system. Understanding multiple sclerosis progression and identifying the implicated brain structures is crucial for personalized treatment decisions.…
Low-rank matrix factorization is a powerful tool for understanding the structure of 2-way data, and is usually accomplished by minimizing a sum of squares criterion. Expectile analysis generalizes squared-error loss by introducing…
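As a rough illustration of how an expectile generalizes squared-error loss, the sketch below uses the standard asymmetric squared loss, which weights positive residuals by `tau` and negative residuals by `1 - tau`. It computes a scalar sample expectile only and is not the matrix factorization method of the paper; the function names and the iterative reweighting scheme are assumptions made for the example.

```python
def expectile_loss(residual, tau):
    """Asymmetric squared loss: tau * u^2 if u > 0, else (1 - tau) * u^2."""
    weight = tau if residual > 0 else 1.0 - tau
    return weight * residual ** 2


def sample_expectile(values, tau, tol=1e-12, max_iter=1000):
    """Sample expectile via iteratively reweighted means.

    The tau-expectile m minimizes sum(expectile_loss(y - m, tau)), so it
    satisfies m = sum(w * y) / sum(w) with w = tau for y > m, else 1 - tau.
    """
    if not 0.0 < tau < 1.0:
        raise ValueError("tau must lie strictly between 0 and 1")
    m = sum(values) / len(values)  # tau = 0.5 gives the ordinary mean
    for _ in range(max_iter):
        weights = [tau if y > m else 1.0 - tau for y in values]
        new_m = sum(w * y for w, y in zip(weights, values)) / sum(weights)
        if abs(new_m - m) < tol:
            return new_m
        m = new_m
    return m


if __name__ == "__main__":
    data = [1.0, 2.0, 3.0, 4.0]
    print(sample_expectile(data, 0.5))
    print(sample_expectile(data, 0.9))
```

With `tau = 0.5` the loss is symmetric and the expectile coincides with the mean; moving `tau` toward 1 pulls the estimate toward the upper tail.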
A novel approach is developed for discovering directed connectivity between specified pairs of nodes in a high-dimensional network (HDN) of brain signals. To accurately identify causal connectivity for such specified objectives, it is…
Using spatially dependent information to model geographic patterns, a principle that has been stipulated as a law in geography, forms the cornerstone of geostatistics and has been inherited by many data-science-based techniques as well, such…
It is no secret that statistical modelling often involves making simplifying assumptions when attempting to study complex stochastic phenomena. Spatial modelling of extreme values is no exception, with one of the most common such…
Interpretable insights from predictive models remain critical in biostatistics, particularly when assessing causality, where classical statistical and machine learning methods often provide inherent clarity. While Neural Networks (NNs)…
This study investigates topsoil contamination in Ireland using geochemical data from the Tellus Programme, analyzing 4,278 soil samples across 17,983 square kilometers. The research employs CPF clustering with spatial constraints to classify…
This study examines the feasibility and profitability of utilizing surplus electricity for Bitcoin mining. Surplus electricity refers to the remaining electricity after net metering, which can be repurposed for Bitcoin mining to improve…
Narwhals in the Arctic are increasingly exposed to human activities that can temporarily or permanently threaten their survival by modifying their behavior. We examine GPS data from a population of narwhals exposed to ship and seismic…
Urban Building Energy Models (UBEM) support urban-scale energy decisions and have recently been applied to use cases requiring dynamic outputs like grid management. However, their predictive capability remains insufficiently addressed,…
The delivery of drug samples helps increase sales of pharmaceutical products [6]. However, we identified several problems that could be improved in the supply chain that delivers drug samples (used for the treatment of excess glucose).…
Humans are able to communicate in sophisticated ways with only sparse signals, especially when cooperating. Two parallel theoretical perspectives on cooperative communication emphasize pragmatic reasoning and joint utility mechanisms to…
Accurate estimation of greenhouse gas (GHG) emissions is essential to meet carbon neutrality targets, particularly through the calculation of direct CO2 emissions from electricity generation. This work reviews and compares emission factor-based…