Applied Statistics
We present D-BIRD, a Bayesian dynamic item response model for estimating student ability from sparse, longitudinal assessments. By decomposing ability into a cohort trend and individual trajectory, D-BIRD supports interpretable modeling of…
In many fields, populations of interest are hidden from data for a variety of reasons, though their magnitude remains important in determining resource allocation and appropriate policy. In public health and epidemiology, linkages or…
Obtaining accurate water level predictions is essential for water resource management and for implementing flood mitigation strategies. Several data-driven models can be found in the literature. However, there has been limited research with…
We propose a statistical approach for estimating the mean line width in spectra comprising Lorentzian, Gaussian, or Voigt line shapes. Our approach uses Gaussian processes in two stages to jointly model a spectrum and its Fourier transform.…
Many important questions in infectious disease epidemiology involve the effects of covariates (e.g., age or vaccination status) on infectiousness and susceptibility, which can be measured in studies of transmission in households or other…
Structural Health Monitoring (SHM) plays a pivotal role in modern civil engineering, providing critical insights into the health and integrity of infrastructure systems. This work presents a novel multivariate long-term profile monitoring…
Climate is an evolving complex system with dynamic interactions and non-linear feedback mechanisms, shaping environmental and socio-economic outcomes. Crop production is highly sensitive to climatic fluctuations (and many other…
Modeling and forecasting air quality is crucial for effective air pollution management and protecting public health. Air quality data, characterized by nonlinearity, nonstationarity, and spatiotemporal correlations, often include extreme…
Programming reliability algorithms is crucial for risk assessment in geotechnical engineering. This study explores the possibility of automating and accelerating this task using Generative AI based on Large Language Models (LLMs).…
A variety of transparency initiatives have been introduced by governments to reduce corruption and allow citizens to independently evaluate the effectiveness and efficiency of spending. In 2010, the UK government mandated transparency for many…
Principal component analysis (PCA) is a widely used unsupervised dimensionality reduction technique in machine learning, applied across various fields such as bioinformatics, computer vision and finance. However, when the response variables…
Increasing surface temperature could lead to enhanced evaporation, reduced soil moisture availability, and more frequent droughts and heat waves. The spatiotemporal co-occurrence of such effects further drives extreme anomalies in…
Ensuring robustness and resilience in intermodal transportation systems is essential for the continuity and reliability of global logistics. These systems are vulnerable to various disruptions, including natural disasters and technical…
Whether a variable is the cause of another, or simply associated with it, is often an important scientific question. Causal inference is the name given to the body of techniques for addressing that question in a statistical setting.…
Electronic devices exhibit changes in electrical resistance over time at varying rates, depending on the configuration of certain components. Since measuring overall electrical resistance requires partial disassembly, only a limited number…
One obstacle to "elevating" correlation to causation is the phenomenon of confounding, i.e., when a correlation between two variables exists because both variables are in fact caused by a third variable. The situation where the confounders…
High-dimensional data are crucial in biomedical research. Integrating such data from multiple studies is a critical process that relies on the choice of advanced statistical models, enhancing statistical power, reproducibility, and…
The amount of high-dimensional large-scale RNA sequencing data derived from multiple heterogeneous sources has increased exponentially in biological science. During data collection, significant technical noise or errors may occur. To…
In flood disasters, decision-makers have to rapidly prioritise the areas that need assistance based on a high volume of information. While approaches that combine GIS with Bayesian networks are generally effective in integrating multiple…
Group-based trajectory modeling (GBTM) is commonly used to identify longitudinal patterns in health outcomes among older adults, with determining the optimal number of groups being a crucial step. While statistically grounded criteria are…