Applied Statistics
This paper explains why internal and external validity cannot be simultaneously maximised. It introduces "evidential states" to represent the information available for causal inference and shows that routine study operations (restriction,…
This study addresses the critical challenge of modeling and mapping urban air quality to ascertain pollutant concentrations in unmonitored locations. The advent of low-cost sensors, particularly those deployed in vehicular networks,…
Reliable and timely dengue predictions provide actionable lead time for targeted vector control and clinical preparedness, reducing preventable diseases and health-system costs in at-risk communities. Dengue forecasting often relies on…
Estimating the size of marginalized populations is a persistent challenge in survey statistics and public health, especially where stigma and legal restrictions exclude such groups from census and administrative data. Migrant domestic…
South Korea faces the dual challenge of managing growing distributed solar energy surpluses and the high energy demand of industries like Bitcoin mining. Leveraging mining operations as a flexible load to monetize this `net-metering…
Transportability, the ability to maintain performance across populations, is a desirable property of markers of clinical outcomes. However, empirical findings indicate that markers often exhibit varying performances across populations. For…
In this article, we explore how the escalating victimization of civilians during civil wars is mirrored in the fragmented distribution of territorial control, focusing on the Colombian armed conflict. Through an exhaustive characterization…
Geographic experiments are a widely used methodology for measuring incremental return on ad spend (iROAS) at scale, yet their design presents significant challenges. The unit count is small, heterogeneity is large, and the optimal Supergeo…
Bayesian networks (BN) have advantages in visualizing causal relationships and performing probabilistic inference analysis, making them ideal tools for coastal hazard analysis and characterizing the compound mechanisms of coastal hazards.…
Healthcare data, particularly in critical care settings, presents three key challenges for analysis. First, physiological measurements come from different sources but are inherently related. Yet, traditional methods often treat each…
A central question in risk analysis is to identify the factors that drive the system toward a specific hazardous outcome, such as the exceedance of a given threshold. When relying on numerical simulators, we propose to study the…
Seat belt use remains one of the most effective measures for reducing vehicle occupant fatalities and injuries. Yet seat belt compliance across different locales demands far more granular data than traditional roadside surveys can…
This paper discusses some statistical aspects of the U.K. Covid-19 pandemic response, focussing particularly on cases where we believe that a statistically questionable approach or presentation has had a substantial impact on public…
The effects of treatments on continuous outcomes can be estimated by the mean difference (i.e. by measurement units) and the relative effect scales (i.e. by percentages), both of which provide only a single effect size estimate over the…
Time-to-event models are commonly used to study associations between risk factors and disease outcomes in the setting of electronic health records (EHR). In recent years, focus has intensified on social determinants of health, highlighting…
In this study, we address the challenge of modelling the spatial variability in violence against women across municipalities in a Southern Italian region by proposing a Bayesian spatio-temporal Poisson regression model. Using data on access…
We study the spatio-temporal features of extremal sub-daily precipitation data over the Piave river basin in northeast Italy using a rich database of observed hourly rainfall. Empirical evidence suggests that both the marginal and…
Plate discipline is an important feature of a hitter's success. Hitters who can recognize good pitches to swing at and balls to take are generally regarded as disciplined hitters. Although there are some metrics that can provide…
Runs Batted In (RBI) records the number of runs a hitter directly drives in during their plate appearances and reflects a batter's ability to convert opportunities into scoring. Because producing runs determines game outcomes, RBI has long…
Adverse childhood experiences (ACEs) are categories of childhood abuse, neglect, and household dysfunction. Screening by a single additive ACE score (e.g., a $\ge 4$ cutoff) has poor individual-level discrimination. We instead identify…