Applied Statistics
The goal of this paper is to develop a multilingual classifier and conditional probability estimator of occupation codes for online job advertisements in accordance with the International Standard Classification of Occupations (ISCO)…
Coral reefs are increasingly subjected to major disturbances threatening the health of marine ecosystems. Substantial research is underway to develop intervention strategies that assist reefs in recovery from, and resistance to, inevitable…
While progress has been made in identifying common genetic variants associated with human diseases, for most common complex diseases the identified genetic variants account for only a small proportion of heritability. Challenges remain…
While ride-hailing services offer increased travel flexibility and convenience, persistent nighttime safety concerns significantly reduce women's willingness to use them. Existing research often treats women as a homogeneous group,…
In many sports, it is commonly believed that the home team has an advantage over the visiting team, known as the home field advantage. Yet its causal effect on team performance is largely unknown. In this paper, we propose a novel causal…
Climate change poses substantial risks to the global economy. Kotz, Levermann and Wenz (Nature, 2024) statistically analyzed economic and climate data, finding significant projected damages until mid-century and a divergence in outcomes…
We present a maximum entropy modeling framework for unimodal time series: signals that begin at a reference level, rise to a single peak, and return. Such patterns are commonly observed in ecological collapse, population dynamics, and…
Wang et al. (2025) use statistics to argue that sex at birth is not a biological coin toss, by noticing that repeated patterns such as Male Male Male and Female Female Female occur in the Nurses Health Study more often than patterns like…
Accurately modeling crash severity on rural two-lane roads is essential for effective safety management, yet standard single-level approaches often overlook unobserved heterogeneity across road segments. In this study, we analyze 19 956…
Long-running clinical trials offer a unique opportunity to study disease progression and treatment response over time, enabling questions about how and when interventions alter patient trajectories. However, drawing causal conclusions in…
Recent advances in wearable technology have enabled the continuous monitoring of vital physiological signals, essential for predictive modeling and early detection of extreme physiological events. Among these physiological signals, heart…
Significant events, such as volcanic eruptions, can have global and long-lasting impacts on climate. These global impacts, however, are not uniform across space and time. Understanding how the Mt. Pinatubo eruption affects global and…
The burden of diabetes has disproportionately impacted Hispanic/Latino residents in the United States, with diet recognized as a major modifiable risk factor. Outcome-dependent dietary patterns provide insight into what foods may be…
This paper shows how measures of uncertainty can be applied to existing population forecasts using Estonia as a case study. The measures of forecast uncertainty are relatively easy to calculate and meet several important criteria used by…
The Area Under the ROC Curve (AUC) is a widely used performance metric for binary classifiers. However, as a global ranking statistic, the AUC aggregates model behavior over the entire dataset, masking localized weaknesses in specific…
This study predicts hourly solar irradiance components, Global Horizontal Irradiance (GHI), Direct Normal Irradiance (DNI), and Diffuse Horizontal Irradiance (DHI) using meteorological data to forecast solar energy output in Ibadan,…
The growing number of infectious disease outbreaks, like the one caused by the SARS-CoV-2 virus, underscores the necessity of actuarial models that can adapt to epidemic-driven risks. Traditional life insurance frameworks often rely on…
The presence or absence of winner-loser effects is a widely discussed phenomenon across both sports and psychology research. Investigation of such effects is often hampered by the limited availability of data. Online chess has exploded in…
We introduce a new approach in which several advanced large language models (specifically GPT-4-0125-preview, Meta-LLAMA-3-70B-Instruct, Claude-3-Opus, and Gemini-1.5-Flash) collaborate to both produce and answer intricate, doctoral-level…
Political actors often manipulate redistricting plans to gain electoral advantages, a process known as gerrymandering. Several states have implemented institutional reforms to address this problem, such as establishing map-drawing…