Applied Statistics
The Check-All-That-Apply (CATA) method was compared to the Adapted-Pivot-Test (APT) method, a recently published method based on pair comparisons between a coded wine and a reference sample, called pivot, and using a set list of attributes…
In applications where the study data are collected within cluster units (e.g., patients within transplant centers), it is often of interest to estimate and perform inference on the treatment effects of the cluster units. However, it is…
Applying simple linear regression models, an economist analysed a published dataset from an influential annual ranking in 2016 and 2017 of consumer outlets for Dutch New Herring and concluded that the ranking was manipulated. His finding…
Incrementality experiments compare customers exposed to a marketing action designed to increase sales to those randomly assigned to a control group. These experiments suffer from noisy responses which make precise estimation of the average…
We describe our experience in developing a predictive model that placed highly in the BigDeal Challenge 2022, an energy competition on load and peak forecasting. We present a novel procedure for feature engineering and feature…
Assessing advancements of technology is essential for creating science and technology policies and making informed investments in the technology market. However, current methods primarily focus on the characteristics of the technologies…
Many large and small companies in the tech and startup sector have been laying off an unusually high number of workers in 2022 and 2023. We are interested in predicting when this period of layoffs might end, without resorting to economic…
Visualization is an essential operation when assessing the risk of rare events such as coastal or river floods. The goal is to display a few prototype events that best represent the probability law of the observed phenomenon, a task…
In a paper recently published in this journal, van Marle et al. (2022) introduce an interesting new data set for land use and land cover change CO2 emissions (LULCC) that they use to study whether a trend is present in the…
Objective. To provide step-by-step guidance and Stata and R code for using propensity score (PS) weighting to estimate moderation effects. Research Design. Tutorial illustrating the key steps for estimating and testing moderation using…
Questionable research practices like HARKing or p-hacking have generated considerable recent interest throughout and beyond the scientific community. We subsume such practices involving secret data snooping that influences subsequent…
In the current study, a new SINARS(1) model is proposed for stationary discrete time series defined on $\boldsymbol{Z}$, based on the extended binomial distribution and Pegram's operator. The model effectively characterizes the series…
An ongoing "reproducibility crisis" calls into question scientific discoveries across a variety of disciplines ranging from life to social sciences. Replication studies aim to investigate the validity of findings in published research, and…
Motivated by the pressing need for suicide prevention through improving behavioral healthcare, we use medical claims data to study the risk of subsequent suicide attempts for patients who were hospitalized due to suicide attempts and later…
Ecological momentary assessment (EMA) data have a broad base of application in the study of time trends and relations. In EMA studies, there are a number of design considerations which influence the analysis of the data. One general…
Timely pre- and post-diagnosis check-ups are critical for cancer patients, across all cancer types, as these often lead to better outcomes. Several socio-demographic properties have been identified as strongly connected with both cancer's…
This paper proposes two practical implementations of Four-Dimensional Variational (4D-Var) Ensemble Kalman Filter (4D-EnKF) methods for non-linear data assimilation. Our formulations' main idea is to avoid the intrinsic need for adjoint…
Experimental designs with hierarchically structured errors are pervasive in many biomedical areas; it is important to take this hierarchical architecture into account in order to capture the dispersion and make reliable inferences from…
Introduction. The societal burden of cognitive impairments in China has prompted researchers to develop clinical prediction models aimed at making risk assessments that enable preventative interventions. However, it is unclear which risk…
Ridesplitting -- a type of ride-hailing in which riders share vehicles with other riders -- has become a common travel mode in some major cities. This type of shared ride option is currently provided by transportation network companies…