应用统计
Psychological research often focuses on examining group differences in a set of numeric variables for which normality is doubtful. Longitudinal studies enable the investigation of developmental trends. For instance, a recent study…
Given the growing prevalence of diabetes, there has been significant interest in determining how diabetes affects instrumental daily functions, like driving. Complication of glucose control in diabetes includes hypoglycemic and…
Major Depressive Disorder (MDD) is one of the most common causes of disability worldwide. Unfortunately, about one-third of patients do not benefit sufficiently from available treatments and not many new drugs have been developed in this…
Sewer pipe network systems are an important part of civil infrastructure, and in order to find a good trade-off between maintenance costs and system performance, reliable sewer pipe degradation models are essential. In this paper, we…
Civil registration and vital statistics (CRVS) systems should be the primary source of mortality data for governments. Accurate and timely measurement of the completeness of death registration helps inform interventions to improve CRVS…
Classifying wine as "good" is a challenging task due to the absence of a clear criterion. Nevertheless, an accurate prediction of wine quality can be valuable in the certification phase. Previously, wine quality was evaluated solely by…
The development of data acquisition systems is facilitating the collection of data that are apt to be modelled as functional data. In some applications, the interest lies in the identification of significant differences in group functional…
Multivariate spatial modeling is key to understanding the behavior of materials downstream in a mining operation. The ore recovery depends on the mineralogical composition, which needs to be properly captured by the model to allow for good…
Case studies are typically used to teach 'ethics', but when the content of a course is focused on formulae and proofs, a case analysis and the knowledge, skills, and abilities they require can be distracting. Moreover, case analyses are…
Previous statistical approaches to hierarchical clustering for social network analysis all construct an "ultrametric" hierarchy. While the assumption of ultrametricity has been discussed and studied in the phylogenetics literature, it has…
Quantifying the number of molecules from fluorescence microscopy measurements is an important topic in cell biology and medical research. In this work, we present a consecutive algorithm for super-resolution (STED) scanning microscopy that…
School value-added models are widely applied to study, monitor, and hold schools to account for school differences in student learning. The traditional model is a mixed-effects linear regression of student current achievement on student…
The Paris Memorandum of Understanding on Port State Control (Paris MoU) is responsible for controlling substandard shipping in European waters and, consequently, increasing the standards of safety, pollution prevention, and onboard living…
Time series forecasting has been a quintessential topic in data science, but traditionally, forecasting models have relied on extensive historical data. In this paper, we address a practical question: How much recent historical data is…
Prediction rule ensembles (PREs) are a relatively new statistical learning method, which aim to strike a balance between predictive accuracy and interpretability. Starting from a decision tree ensemble, like a boosted tree ensemble or a…
Respondent-driven sampling (RDS) is both a sampling strategy and an estimation method. It is commonly used to study individuals that are difficult to access with standard sampling techniques. As with any sampling strategy, RDS has…
In this paper we review the academic transportation literature published between 2014 and 2018 to evaluate where the field stands regarding the use and misuse of statistical significance in empirical analysis, with a focus on discrete…
Chromates are widely used for their anticorrosive properties. Unfortunately, they are highly hazardous with environmental agencies regulating their levels to below 10 ppb in drinking water. As anion exchange resins are typically used for…
We first consider the method of scoring students' self-assessment of confidence (SAC) used by Foster in [1], and find that with it reporting their true confidence is not the optimal strategy for students. We then identify all continuously…
$\textbf{Objective:}$ High-throughput phenotyping will accelerate the use of electronic health records (EHRs) for translational research. A critical roadblock is the extensive medical supervision required for phenotyping algorithm (PA)…