Applied Statistics
In high-stakes ML applications such as fraud detection, medical diagnostics, and content moderation, practitioners rely on consensus-based approaches to control prediction quality. A particularly valuable technique -- $\delta$…
In this paper, we propose a novel frequency-severity joint trip-level risk index that combines the frequency of abnormal driving patterns with a severity component reflecting how extreme such behavior is relative to a portfolio-level…
Classical Monte Carlo methods for pricing catastrophe insurance tail risk converge at a rate of order $1/\sqrt{N}$, requiring large simulation budgets to resolve upper-tail percentiles of the loss distribution. This sample-sparsity problem can…
Scientific experimentation is largely driven by statistical hypothesis testing to determine significant differences in interventions. Traditionally, experimenters allocate samples uniformly between each intervention. However, such an…
The employment of peer supporter workers starting in 2018 was one of the interventions deployed by National Health Service England as part of its Hepatitis C virus (HCV) elimination plan. Peers are individuals with relevant lived experience…
Determining which organizations are more effective in implementing an intervention program is essential for theoretically and empirically characterizing exemplary practice and for intervening to enhance the capacity of ineffective ones. Yet…
Researchers increasingly use data on social and economic networks to study a range of social science questions, but releasing statistics derived from networks can raise significant privacy concerns. We show how to release network…
Dynamic PET kinetic modeling increasingly demands voxelwise uncertainty quantification and robust model selection. Yet total-body PET (TB-PET) data volumes make conventional Bayesian approaches, such as per-voxel MCMC, computationally…
Tropical deforestation and rural poverty are deeply intertwined, yet isolating the causal effect of income on forest loss remains challenging. We use the 2015 global vanilla price boom, triggered by food-industry shifts toward natural…
Mental health difficulties among elementary school students represent a growing public health concern in South Korea, yet analytical tools for identifying school-specific vulnerability patterns from item response data remain limited. We…
Gas gun and other shock compression experiments often produce shock wave velocity measurements that are linearly associated with particle velocity. Traditionally, this empirical relationship is quantified with a single Hugoniot curve that…
Huntington disease (HD) is a neurodegenerative disease with progressively worsening symptoms. Accurately modeling time to HD diagnosis is essential for clinical trial design. Langbehn's model, the CAG-Age Product (CAP) model, the Prognostic…
Accurate intraday forecasts are essential for power system operations, complementing day-ahead forecasts that gradually lose relevance as new information becomes available. This paper introduces a Bayesian updating mechanism that converts…
Per- and polyfluoroalkyl substances (PFAS) are persistent environmental pollutants of major public health concern due to their resistance to degradation, widespread presence, and potential health risks. Analyzing PFAS in groundwater is…
Crop rotation impacts on soil nutrients are typically assessed using field-averaged or single-nutrient analyses that ignore spatial heterogeneity and multivariate interactions. We propose a multivariate lattice model treating soil as a 4D…
Surface temperature is a fundamental Essential Climate Variable, serving as a primary indicator of climate change and exerting a profound influence on ecosystems, agriculture, and human livelihoods. Although existing research provides a…
Diesel engine particulate matter (PM) is one of the most challenging emission constituents to predict. As engines become cleaner and emissions levels drop, manufacturers need reliable methods to quantify the PM generated by production…
Improvements in seismic instrumentation have been shown to affect the number of observed earthquakes, since denser networks usually record more events. However, phenomena such as strong earthquakes or…
Compositional data, such as regional shares of economic sectors or property transactions, are central to understanding structural change in economic systems across space and time. This paper introduces a spatiotemporal multivariate…
New and existing methods for generating, and especially detecting, deepfakes are investigated and compared on the simple problem of authenticating coin flip data. Importantly, an alternative approach to deepfake generation and detection,…