Statistics
Ensemble forecasts are commonly used to support decision-making and policy planning across various fields because they often offer improved accuracy and stability compared to individual models. As each model has its own unique…
The airborne fraction is the share of anthropogenic carbon dioxide emissions that remains in the atmosphere and is a key indicator of carbon-cycle response and remaining carbon budgets under continued emissions. Whether this share is rising…
Large language models (LLMs) are increasingly used in statistical research and applications. However, they are also notorious for producing unreliable or biased information. Here, we explore whether LLMs can be used to improve the precision of…
Latent position models (LPMs) are a large and popular class of models for random graphs. However, fitting Bayesian LPMs is computationally challenging: computing the likelihood even once takes time that is quadratic in the number of…
Contact (or mixing, or more generally connectivity) matrices are a fundamental component of modelling and inference for infectious disease epidemiology. Their structure and parametrisation directly account for the frequency of interactions…
A common challenge in data analysis is uncovering relationships between predictors and responses in problems involving large numbers of both. When the number of predictors and responses is limited, visual approaches are particularly…
The mean squared displacement (MSD) of particles or probes is commonly estimated from microscopy videos using particle tracking approaches, which rely on manually tuned parameters and are often unstable over the entire lag time range,…
In the demographic literature, forecast uncertainty is often quantified with a statistical model. This model-based approach may suffer from drawbacks, namely model misspecification, selection effects, and lack of finite-sample…
Marine corrosion significantly reduces a ship's availability, increases operating costs, and can compromise safety. Protective coatings mitigate these risks, but their effectiveness deteriorates over time. Early detection of coating…
Storage tanks for hazardous liquids are common in industry and agriculture. During a pollution incident, liquid may drain from a storage tank through a small hole, crack, or pipe. After containing the leak, estimating the discharged volume…
Spatial individual-level models (ILMs) provide a flexible framework for modelling infectious disease transmission across populations with known locations. Bayesian inference for these models relies on Markov chain Monte Carlo (MCMC), which…
Gaussian Process (GP) models provide a flexible framework for prediction and uncertainty quantification. For most covariance functions, however, exact GP prediction with $n$ points scales as $\mathcal{O}(n^3)$, making it prohibitively…
Clinical trials usually target average treatment effects, but treatment decisions are made for individuals. This tension motivates a common criticism of evidence-based medicine: a treatment that is beneficial on average may be inappropriate…
There is enduring interest in disentangling the effects of skill and luck in sport. A key issue in Formula 1 is distinguishing between car-level and driver-level effects. Four elite teams currently dominate Formula 1 and have won every…
This paper presents a novel approach to classical linear regression, enabling model computation from data streams or in a distributed setting while preserving data privacy in federated environments. We extend this framework to generalized…
Structural and practical parameter non-identifiability issues are common when mathematical models are used to interpret data. Such issues motivate model reparameterisation and reduction methods. Here, we consider Invariant Image…
We develop Microcanonical Hamiltonian Monte Carlo (MCHMC), a class of models which follow fixed-energy Hamiltonian dynamics, in contrast to Hamiltonian Monte Carlo (HMC), which follows the canonical distribution with different energy levels.…
Digital health technologies enable high-frequency collection of data in near-continuous time and capture rich information about the health of individuals. The raw data collected by these devices often have a hierarchical functional…
Intra-physician prescribing variability, the probability that one physician issues discordant decisions for two patients deemed comparable on observed covariates, has a major impact on quality of care, safety, and cost. However, there are no…
The Empirical Bayes (EB) procedure of Hauer et al. (2002) is the workhorse of highway safety analysis: it combines a Safety Performance Function with observed crash counts to produce shrinkage estimates of segment-level crash rates. EB…