Applied Statistics
This paper develops a logistic-aided Huber (LAH) M-estimator for robust GNSS positioning under long-tailed, multipath-affected measurement errors. The key idea is to leverage a logistic measurement error assumption and establish a…
Prior-Data Fitted Networks (PFNs) represent a paradigm shift in tabular data prediction. We present the principles of this new paradigm and evaluate two PFNs for estimating the average treatment effect (ATE) of a binary treatment on a…
Electronic health record (EHR)-linked biobank data hold tremendous promise for large-scale discoveries via genome-wide association study (GWAS) on diverse phenotypic traits and biomarkers routinely captured in the EHR. However,…
Improving statistical forecasts of tropical cyclone (TC) intensity is limited by complex nonlinear interactions and difficulty in identifying relevant predictors. Conventional methods prioritize correlation or fit, often overlooking…
Analyzing the transportation usage rate, an indicator that integrates actual demand and capacity, provides an opportunity to evaluate the efficacy of the transportation service offered. This study aims to develop a…
The information about pavement surface type is rarely available in road network databases of developing countries although it represents a cornerstone of the design of efficient mobility systems. This research develops an automatic…
Climate change detection and attribution (D&A) is concerned with determining the extent to which anthropogenic activities have influenced specific aspects of the global climate system. D&A fits within the broader field of causal inference,…
Single-cell transcriptomic data approximates the abundance of proteins at a high resolution, but its noisiness necessitates transformation by a pipeline of methods before analysis and inference. In the absence of robust validation of these…
Understanding how the composition of guest origin markets evolves over time is critical for destination marketing organizations, hospitality businesses, and tourism planners. We develop and apply Bayesian Dirichlet autoregressive moving…
Market area models, such as the Huff model and its extensions, are widely used to estimate regional market shares and customer flows of retail and service locations. Another, now very common, area of application is the analysis of catchment…
The opioid epidemic remains a major public health challenge in the United States, requiring a multi-pronged intervention approach to mitigate harms to communities. Given the heterogeneity of the epidemic across the country, it is crucial…
A scale mixture of normals is a distribution formed by mixing a collection of normal distributions with fixed mean but different variances. A generalized gamma scale mixture draws the variances from a generalized gamma distribution.…
Prevalence surveys are routinely used to monitor the effectiveness of mass drug administration (MDA) programmes for controlling neglected tropical diseases (NTDs). We propose a decay-adjusted spatio-temporal (DAST) model that explicitly…
Observational studies often present challenges for causal inference due to confounding and heterogeneity. In this paper, we illustrate how modern causal inference methods can be applied to large-scale academic salary data. Using records…
Human aging is marked by a steady rise in the risk of dying with age, a process demographers call senescence. Over the past century, life expectancy has risen dramatically, but is this because we are aging slower, or simply starting it…
In interventional health studies, causal mediation analysis can be employed to investigate mechanisms through which the intervention affects the targeted health outcome. Identifying direct and indirect (i.e. mediated) effects from empirical…
Most point process models for earthquakes currently in the literature assume that magnitudes are i.i.d., potentially hindering the ability of the model to describe the main features of data sets containing multiple earthquake…
This paper extends the existing fractional Hawkes process to better model mainshock-aftershock sequences of earthquakes. The fractional Hawkes process is a self-exciting point process model with temporal decay kernel being a Mittag-Leffler…
In this work, we analyze 126 publicly available IAM climate scenarios modeled by six leading teams in climate science. We define a simple numerical metric that measures the decarbonization speed implied by each IAM scenario. With this…
The existence of an upper limit to the human lifespan has been widely debated, with studies offering both supporting and opposing evidence. Using unique individual-level death and population records for individuals aged 90 and older in…