Applied Statistics
Violence Against Women (VAW) is a widespread issue deeply rooted in social and cultural structures. Affecting women of all ages and backgrounds, VAW is often underreported due to stigma and victim-blaming. This study explores young people's…
Identifying the causal effects of socioeconomic determinants on population health is of great interest to many fields, from statistical methodology development to public health practice and policy development. The statistical side of the…
Forecasting accuracy is bounded by the information available about the future. This paper makes that statement precise using information-theoretic tools. Under logarithmic loss, the expected performance of any probabilistic forecast…
Accurate day-ahead electricity price forecasts are critical for power system operation and market participation, yet growing renewable penetration and recent crises have caused unprecedented volatility that challenges standard models. This…
Recent advances in large language model (LLM) embeddings have enabled powerful representations for biological data, but most applications to date focus on gene-level information. We present one of the first systematic frameworks to generate…
Modeling higher-order interactions (HOI) has emerged as a crucial challenge in complex systems analysis, as many phenomena cannot be fully captured by pairwise relationships alone. Hypergraphs, which generalize graphs by allowing…
The rivalry between two football superstars, Cristiano Ronaldo and Lionel Messi, has long been a subject of extensive discussion. This study aimed to compare the two players' consistency in scoring goals in six ways:…
Bayesian inference often relies on Markov chain Monte Carlo (MCMC) methods, which are particularly needed for non-Gaussian data families. When dealing with complex hierarchical models, the MCMC approach can be computationally demanding in workflows…
Forecasting is usually framed as a problem of model choice. This paper starts earlier, asking how much predictive information is available at each horizon. Under logarithmic loss, the answer is exact: the mutual information between the…
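The abstract above is truncated, so the following is not a reconstruction of the paper's result. It is only a minimal sketch of a standard identity that connects logarithmic loss to mutual information: when forecasting an outcome Y, the expected log-loss saved by issuing the conditional distribution p(y | x) instead of the marginal p(y) equals the mutual information I(X; Y). The joint distribution and all function names below are hypothetical and chosen purely for illustration.

```python
import math

# Hypothetical joint distribution p(x, y) for a binary predictor X and a
# binary outcome Y. These numbers are illustrative, not taken from any paper.
joint = {(0, 0): 0.4, (0, 1): 0.1, (1, 0): 0.1, (1, 1): 0.4}


def marginal(joint, axis):
    """Sum the joint over the other variable (axis 0 gives p(x), axis 1 gives p(y))."""
    out = {}
    for key, p in joint.items():
        out[key[axis]] = out.get(key[axis], 0.0) + p
    return out


def log_loss_marginal(joint):
    """Expected log loss (in nats) of always forecasting the marginal p(y)."""
    p_y = marginal(joint, 1)
    return -sum(p * math.log(p_y[y]) for (_, y), p in joint.items() if p > 0)


def log_loss_conditional(joint):
    """Expected log loss (in nats) of forecasting p(y | x) after observing x."""
    p_x = marginal(joint, 0)
    return -sum(p * math.log(p / p_x[x]) for (x, _), p in joint.items() if p > 0)


def mutual_information(joint):
    """I(X; Y) in nats, computed directly from its definition."""
    p_x = marginal(joint, 0)
    p_y = marginal(joint, 1)
    return sum(
        p * math.log(p / (p_x[x] * p_y[y]))
        for (x, y), p in joint.items()
        if p > 0
    )


# The log-loss gain from using the predictor matches the mutual information.
gain = log_loss_marginal(joint) - log_loss_conditional(joint)
mi = mutual_information(joint)
```

Under these assumptions `gain` and `mi` agree up to floating-point error, which is one way to read the statement that predictive information bounds forecast performance under logarithmic loss.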
Autism Spectrum Disorder (ASD) is a neurodevelopmental condition characterized by atypical brain connectivity. One of the crucial steps in addressing ASD is its early detection. This study introduces a novel computational framework that…
In professional sports analytics, evaluating the relationship between accumulated workload and injury risk is a central objective. However, naive survival models applied to NBA game-log data consistently yield a paradox: players who…
Spousal bereavement severely deteriorates mental health. While palliative care benefits dying patients, its "stress-buffering" effect on survivors' depression remains empirically elusive due to acute small-$N$ constraints in longitudinal…
This paper introduces \emph{biased mean regression}, estimating the \emph{biased mean}, i.e., $\mathbb{E}[Y] + x$, where $x \in \mathbb{R}$. The approach addresses a fundamental statistical problem that covers numerous applications. For…
Patients with metastatic breast cancer (mBC) undergo repeated computed tomography (CT) imaging during treatment to monitor disease progression. Accurate longitudinal tracking of individual lesions across scans from multiple radiologists is…
A generalised concept of the signal-to-noise ratio (or equivalently the ratio of predictable components, or RPC) is provided, based on proper scoring rules. This definition is the natural generalisation of the classical RPC, yet it allows…
In this paper we investigate the spatio-temporal dynamics of obesity rates across Italian regions from 2010 to 2022, aiming to identify spatial and temporal trends and assess potential heterogeneities. We implement a Bayesian hierarchical…
Criminal activity data are typically available via a three-way tensor encoding the reported frequencies of different crime categories across time and space. The challenges that arise in the design of interpretable, yet realistic,…
Racial disparities in healthcare expenditures are well-documented, yet the underlying drivers remain complex. This study develops a framework to decompose such disparities through shifts in the distributions of mediating variables, rather…
Obesity is widely recognized as a serious and pervasive health concern. We study obesity through body mass index (BMI), which is known to be highly heritable, and identify important genetic risk factors for BMI from hundreds of thousands of…
This study uses a two-period difference-in-differences design to examine the impact of residential energy retrofits on household energy consumption in France, drawing on smart meter data from nearly 2,500 Hello Watt users. The dataset combines…
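For readers unfamiliar with the design, a two-period difference-in-differences estimate can be sketched as below. This is a generic textbook illustration, not the study's actual specification: the record layout, the function name `did_estimate`, and the consumption figures are all hypothetical.

```python
def did_estimate(records):
    """Two-period difference-in-differences on group means.

    Each record is a dict with hypothetical keys:
      'treated' (bool): whether the household was retrofitted,
      'post'    (bool): whether the observation is from the second period,
      'y'       (float): the outcome, e.g. energy consumption.
    Returns (treated post - treated pre) - (control post - control pre).
    """
    sums = {}
    counts = {}
    for r in records:
        cell = (r["treated"], r["post"])
        sums[cell] = sums.get(cell, 0.0) + r["y"]
        counts[cell] = counts.get(cell, 0) + 1

    def mean(cell):
        return sums[cell] / counts[cell]

    treated_change = mean((True, True)) - mean((True, False))
    control_change = mean((False, True)) - mean((False, False))
    return treated_change - control_change


# Made-up illustrative data: two households per group and period.
records = [
    {"treated": False, "post": False, "y": 98.0},
    {"treated": False, "post": False, "y": 102.0},
    {"treated": False, "post": True, "y": 94.0},
    {"treated": False, "post": True, "y": 96.0},
    {"treated": True, "post": False, "y": 108.0},
    {"treated": True, "post": False, "y": 112.0},
    {"treated": True, "post": True, "y": 89.0},
    {"treated": True, "post": True, "y": 91.0},
]

effect = did_estimate(records)
```

The estimate only has a causal interpretation under the parallel-trends assumption, namely that the treated group would have followed the control group's change in the absence of treatment.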