Statistics
Ensemble forecasts are commonly used to support decision-making and policy planning across various fields because they often offer improved accuracy and stability compared to individual models. As each model has its own unique…
Latent position models (LPMs) are a large and popular class of models for random graphs. However, fitting Bayesian LPMs is computationally challenging: computing the likelihood even once takes time that is quadratic in the number of…
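To illustrate where the quadratic cost comes from, here is a minimal sketch of one likelihood evaluation. It assumes a distance-based LPM with logit P(edge i~j) = alpha - ||z_i - z_j||, which is one common choice and is not stated in the abstract. The function name and arguments are hypothetical.

```python
import math

def lpm_log_likelihood(adj, z, alpha):
    """Bernoulli log-likelihood of an undirected graph under a distance LPM.

    Assumed model (not taken from the abstract):
        logit P(edge i~j) = alpha - ||z_i - z_j||
    adj is a symmetric 0/1 adjacency matrix, z a list of latent positions.
    Returns the log-likelihood and the number of node pairs visited.
    """
    n = len(adj)
    total = 0.0
    pairs = 0
    for i in range(n):
        for j in range(i + 1, n):  # every unordered pair: n(n-1)/2 terms
            eta = alpha - math.dist(z[i], z[j])
            # Numerically stable log(1 + exp(eta)).
            log1pexp = max(eta, 0.0) + math.log1p(math.exp(-abs(eta)))
            # y*log(p) + (1-y)*log(1-p) simplifies to y*eta - log(1+exp(eta)).
            total += adj[i][j] * eta - log1pexp
            pairs += 1
    return total, pairs
```

The double loop visits all n(n-1)/2 pairs, so a single evaluation already scales quadratically in the number of nodes.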
A common challenge in data analysis is uncovering relationships between predictors and responses in problems involving large numbers of both. When the number of predictors and responses is limited, visual approaches are particularly…
Spatial individual-level models (ILMs) provide a flexible framework for modelling infectious disease transmission across populations with known locations. Bayesian inference for these models relies on Markov chain Monte Carlo (MCMC), which…
Gaussian Process (GP) models provide a flexible framework for prediction and uncertainty quantification. For most covariance functions, however, exact GP prediction with $n$ points scales as $\mathcal{O}(n^3)$, making it prohibitively…
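The cubic cost can be seen in a minimal sketch of exact GP prediction, where the Cholesky factorization of the n-by-n covariance matrix dominates. The squared-exponential kernel, the noise level, and all function names are assumptions chosen for illustration only.

```python
import math

def rbf(x1, x2, lengthscale=1.0):
    """Squared-exponential covariance (an assumed kernel choice)."""
    return math.exp(-0.5 * ((x1 - x2) / lengthscale) ** 2)

def cholesky(a):
    """Lower-triangular L with L L^T = a; three nested loops, O(n^3)."""
    n = len(a)
    L = [[0.0] * n for _ in range(n)]
    for i in range(n):
        for j in range(i + 1):
            s = a[i][j] - sum(L[i][k] * L[j][k] for k in range(j))
            L[i][j] = math.sqrt(s) if i == j else s / L[j][j]
    return L

def solve_chol(L, b):
    """Solve (L L^T) x = b by forward then backward substitution, O(n^2)."""
    n = len(b)
    y = [0.0] * n
    for i in range(n):
        y[i] = (b[i] - sum(L[i][k] * y[k] for k in range(i))) / L[i][i]
    x = [0.0] * n
    for i in reversed(range(n)):
        x[i] = (y[i] - sum(L[k][i] * x[k] for k in range(i + 1, n))) / L[i][i]
    return x

def gp_predict_mean(xs, ys, x_new, noise=1e-6):
    """Posterior mean at x_new for a zero-mean GP with 1-D inputs."""
    n = len(xs)
    K = [[rbf(xs[i], xs[j]) + (noise if i == j else 0.0) for j in range(n)]
         for i in range(n)]
    L = cholesky(K)             # the O(n^3) step
    alpha = solve_chol(L, ys)   # (K + noise * I)^{-1} y
    return sum(rbf(x_new, xs[i]) * alpha[i] for i in range(n))
```

With a very small noise term the posterior mean nearly interpolates the training values, which gives a quick sanity check on the sketch.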
This paper presents a novel approach to classical linear regression, enabling model computation from data streams or in a distributed setting while preserving data privacy in federated environments. We extend this framework to generalized…
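One standard way to fit a linear regression from a stream or across sites is to accumulate sufficient statistics and share only those aggregates. The sketch below shows that idea for a single predictor. It is not claimed to be the paper's method, and the class and method names are hypothetical.

```python
from dataclasses import dataclass

@dataclass
class RegressionStats:
    """Running sums that suffice for simple least squares y = a + b*x."""
    n: int = 0
    sx: float = 0.0
    sy: float = 0.0
    sxx: float = 0.0
    sxy: float = 0.0

    def update(self, x, y):
        """Absorb one observation from a stream; the raw point is not stored."""
        self.n += 1
        self.sx += x
        self.sy += y
        self.sxx += x * x
        self.sxy += x * y

    def merge(self, other):
        """Combine aggregates from another site; only sums are exchanged."""
        return RegressionStats(
            self.n + other.n,
            self.sx + other.sx,
            self.sy + other.sy,
            self.sxx + other.sxx,
            self.sxy + other.sxy,
        )

    def coefficients(self):
        """Solve the 2x2 normal equations; returns (intercept, slope)."""
        denom = self.n * self.sxx - self.sx ** 2
        slope = (self.n * self.sxy - self.sx * self.sy) / denom
        intercept = (self.sy - slope * self.sx) / self.n
        return intercept, slope

# Usage: two sites see disjoint parts of the data and share only their sums.
site_a, site_b = RegressionStats(), RegressionStats()
for x, y in [(0.0, 2.0), (1.0, 5.0)]:
    site_a.update(x, y)
for x, y in [(2.0, 8.0), (3.0, 11.0)]:
    site_b.update(x, y)
intercept, slope = site_a.merge(site_b).coefficients()
```

Because the sums are additive, the merged fit matches the fit on the pooled data. Sharing aggregates alone does not by itself amount to a formal privacy guarantee.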
We develop Microcanonical Hamiltonian Monte Carlo (MCHMC), a class of models that follow fixed-energy Hamiltonian dynamics, in contrast to Hamiltonian Monte Carlo (HMC), which follows the canonical distribution with different energy levels.…
Dual system estimation (DSE) is heavily used in Census Bureau operations. With DSE methods, it is important to infer the population size among those with missing data from one or both data sources. The use of…
Estimating equations arise in a wide range of statistical applications, including longitudinal and clustered data analysis, survival analysis, econometrics, and semiparametric inference. In high-dimensional settings, adding…
This paper proposes the quantile unit-log-symmetric autoregressive moving average (QULS--ARMA) model for bounded time series on the open unit interval $(0,1)$. The model extends the unit-log-symmetric family by introducing a quantile-based…
We propose a novel kinetic Langevin sampler based on a specific splitting scheme using the exact harmonic Langevin integrator. For strongly log-concave target measures, the sampler exploits a decomposition of the strongly convex potential…
Data assimilation combines dynamical models with observations to improve state estimates. Ensemble filters sequentially assimilate observations by updating a set of samples over time, alternating between a forecast and an analysis step.…
Stochastic simulators are increasingly used to expand the frontier of scientific knowledge and inform decision-making across real-world contexts. Simulator calibration, a process by which internal model inputs are tuned to match some…
Estimating the probabilities of rare failure events is a key challenge in the reliability analysis of physical systems. Subset simulation (SS) is a very popular adaptive Monte Carlo method for this problem. In SS, the small failure…
We introduce a Hamiltonian Monte Carlo (HMC) methodology based on a randomized selection of integration times, referred to as eHMC, where "e" stands for empirical. The approach relies on an offline calibration phase that leverages…
The moveEZ (pronounced move easy) R package provides tools for constructing animated PCA biplots that reveal how multivariate structure evolves across the ordered levels of a categorical variable. Built as an extension to the biplotEZ…
This note provides a lightweight tutorial on using Eigen, a C++ template library for linear algebra, to implement statistical and machine learning algorithms. The emphasis is practical rather than methodological: we show how common matrix…
Multiproposal MCMC (MP-MCMC) algorithms use clouds of proposals to efficiently traverse state spaces and overcome complex target geometries. While MCMC methods are embarrassingly parallel by nature, the non-trivial forms of parallelism…
State-space models (SSMs) are powerful probabilistic tools for modeling time-varying systems with latent dynamics. Inference in SSMs involves the estimation of latent states and parameters. In this work, we focus on parameter inference,…
Scientific computer simulations cannot represent all scales in realistic applications. To bridge this model-data gap, parameters are injected into models and constrained with noisy data using Bayesian inversion. To reduce the number of…