Applied Statistics
Rapid resource model updating with real-time data is important for making timely decisions in resource management and mining operations. This requires optimal merging of models and observations, which can be achieved through data…
The interventional effects approach to causal mediation analysis is increasingly common in epidemiologic research, given its potential to address policy-relevant questions about hypothetical mediator interventions. Multiple imputation (MI)…
Land cover classification faces persistent challenges with inter-investigator variability and salt-and-pepper noise. Although cloud platforms such as Google Earth Engine have made land cover classification more accessible, these issues…
The present study aimed to solve the cure optimization problem of laminated composites through a statistical approach. The approach consisted of using constrained Bayesian Optimization (cBO) along with a Gaussian process model as a…
In a world of utility-driven marketing, each company acts as an adversary to other contenders, with all having competing interests. A major challenge for companies launching a new product is that, despite testing, flaws in their product can…
Statistical methodologies have had a significant impact on the study of groundwater over the last several decades, owing to cheaper computing power and technologies that enable us to extract and measure ever more data.…
The frequency response function (FRF) is a typical way to describe the outcome of experiments where posture control is perturbed with an external stimulus. The FRF is an empirical transfer function between an input stimulus and the induced…
Many studies have examined social determinants of health (SDoH) independently, overlooking their interconnected nature. Our study uses a multidimensional approach to construct a neighborhood-level measure that explores how multiple SDoH…
The development of robust odor navigation strategies for automated environmental monitoring applications requires realistic simulations of odor time series for agents moving across large spatial scales. Traditional approaches that rely on…
Introduction: Substance use disorders (SUDs) have emerged as a pressing public health concern in the United States, with adolescent substance use often leading to SUDs in adulthood. Effective strategies are needed to stem this progression.…
Effective policy and intervention strategies to combat human trafficking for child sexual exploitation material (CSEM) production require accurate prevalence estimates. Traditional Network Scale Up Method (NSUM) models often necessitate…
When extreme weather events affect large areas, their regional to sub-continental spatial scale is a key determinant of their impacts. We propose a novel machine learning (ML) framework that integrates spatial extreme-value theory to model weather…
Analysing age-specific mortality, fertility, and migration patterns is a crucial task in demography with significant policy relevance. In practice, such analysis is challenging when studying a large number of subpopulations, due to small…
This work introduces an anonymization scheme for a corpus of texts to safeguard metadata from disclosure. It specifically aims to prevent large language models from identifying metadata associated with texts, thereby avoiding their…
Mission profiles cover the conditions that a component, e.g., an electronic component of a vehicle, is exposed to during its lifecycle. Currently, these profiles typically provide descriptive summaries, such as histograms, of single stress…
Vehicle telematics provides granular data for dynamic driving risk assessment, but current methods often rely on aggregated metrics (e.g., harsh braking counts) and do not fully exploit the rich time-series structure of telematics data. In…
In this article, we propose a non-parametric Bayesian level-set method for simultaneous reconstruction of two different piecewise constant coefficients in an elliptic partial differential equation. We show that the Bayesian formulation of…
We introduce the evolving categories multinomial (ECM) distribution for multivariate count data taken over time. This distribution models the counts of individuals following iid stochastic dynamics among categories, with the number and…
Sample size calculation is crucial in biomedical in vivo research for two main reasons: to design the most resource-efficient studies and to address ethical concerns when live animals are the test subjects. In this…
Online A/B experiments generate millions of user-activity records each day, yet experimenters need timely forecasts to guide roll-outs and safeguard user experience. Motivated by the problem of activity prediction for A/B tests at Amazon,…