Applied Statistics
Objective: This work introduces a framework for multivariate time series analysis aimed at detecting and quantifying collective emerging behaviors in the dynamics of physiological networks. Methods: Given a network system mapped by a vector…
Data-driven research is becoming a new paradigm in transportation, but the natural lack of individual socio-economic attributes in transportation data makes research such as activity purpose inference and mobility pattern identification…
Cause-of-death data is fundamental for understanding population health trends and inequalities as well as designing and evaluating public health interventions. A significant proportion of global deaths, particularly in low- and…
The optimization of expensive black-box simulators arises in a myriad of modern scientific and engineering applications. Bayesian optimization provides an appealing solution, by leveraging a fitted surrogate model to guide the selection of…
Modern recording techniques enable neuroscientists to simultaneously study neural activity across large populations of neurons, and capturing predictor-dependent correlations is a fundamental challenge in neuroscience. Moreover, the…
The ability to make accurate predictions with quantified uncertainty provides a crucial foundation for the successful management of a geothermal reservoir. Conventional approaches for making predictions using geothermal reservoir models…
The paper investigates the escalating concerns surrounding the surge in diabetes cases, exacerbated by the COVID-19 pandemic, and the subsequent strain on medical resources. The research aims to construct a predictive model quantifying…
Most contemporary mortality models rely on extrapolating trends or past events. However, population dynamics will be significantly impacted by climate change, notably the influence of temperatures on mortality. In this paper, we introduce a…
Ensuring product quality is critical to combating the global challenge of substandard and falsified medical products. Post-marketing surveillance is a central quality-assurance activity in which products from consumer-facing locations are…
Substandard and falsified pharmaceuticals, prevalent in low- and middle-income countries, substantially increase levels of morbidity, mortality and drug resistance. Regulatory agencies combat this problem using post-market surveillance by…
The standard mathematical approach to fourth-down decision making in American football is to make the decision that maximizes estimated win probability. Win probability estimates arise from machine learning models fit from historical data.…
Sparse model estimation is a topic of high importance in modern data analysis due to the increasing availability of data sets with a large number of variables. Another common problem in applied statistics is the presence of outliers in the…
The evaluation of generative or discriminative large language model (LLM)-based systems is often a complex multi-dimensional problem. Typically, a set of system configuration alternatives are evaluated on one or more benchmark datasets,…
Measurement system analysis aims to quantify the variability in data attributable to the measurement system and evaluate its contribution to overall data variability. This paper conducts a rigorous theoretical investigation of the…
This work aims to combine two primary meteorological data sources in the Philippines: data from a sparse network of weather stations and outcomes of a numerical weather prediction model. To this end, we propose a data fusion model which is…
We propose a new methodology to simulate the discounted penalty applied to a wind-farm operator for violating ramp-rate limitation policies. It is assumed that the operator manages a wind turbine plugged into a battery, which either provides…
Errors-in-variables curves are curves where errors exist not only in the dependent variable but also in the independent variable. We address the challenge of constructing simultaneous confidence bands (SCBs) for such curves. Our method…
Research on modeling the distributional aspects in sensor-based digital health (sDHT) data has grown significantly in recent years. Most existing approaches focus on using individual-specific density or quantile functions. However, there…
This paper proposes a Workflow for Assessing Treatment effeCt Heterogeneity (WATCH) in clinical drug development targeted at clinical trial sponsors. WATCH is designed to address the challenges of investigating treatment effect…
This study develops a Bayesian hierarchical model to explore the effects of air pollution on respiratory and cardiovascular mortality in Los Angeles County. The model takes into account various pollutants such as PM2.5, PM10, CO, SO2, NO2…