Related papers: An efficient multiple imputation algorithm for con…

A monotone data augmentation algorithm for multivariate nonnormal data: with applications to controlled imputations for longitudinal trials

An efficient monotone data augmentation (MDA) algorithm is proposed for missing data imputation for incomplete multivariate nonnormal data that may contain variables of different types, and are modeled by a sequence of regression models…

Methodology · Statistics 2018-11-21 Yongqiang Tang

How to apply multiple imputation in propensity score matching with partially observed confounders: a simulation study and practical recommendations

Propensity score matching (PSM) has been widely used to mitigate confounding in observational studies, although complications arise when the covariates used to estimate the PS are only partially observed. Multiple imputation (MI) is a…

Applications · Statistics 2021-07-22 Albee Y. Ling, Maria E. Montez-Rath, Maya B. Mathur, Kris Kapphahn, Manisha Desai

Addressing missing data mechanism uncertainty using multiple-model multiple imputation: Application to a longitudinal clinical trial

We present a framework for generating multiple imputations for continuous data when the missing data mechanism is unknown. Imputations are generated from more than one imputation model in order to incorporate uncertainty regarding the…

Applications · Statistics 2013-01-14 Juned Siddique, Ofer Harel, Catherine M. Crespi

Imputation and Missing Indicators for handling missing data in the development and implementation of clinical prediction models: a simulation study

Background: Existing guidelines for handling missing data are generally not consistent with the goals of prediction modelling, where missing data can occur at any stage of the model pipeline. Multiple imputation (MI), often heralded as the…

Methodology · Statistics 2022-06-27 Rose Sisk, Matthew Sperrin, Niels Peek, Maarten van Smeden, Glen P. Martin

Improving the mixed model for repeated measures to robustly increase precision in randomized trials

In randomized trials, repeated measures of the outcome are routinely collected. The mixed model for repeated measures (MMRM) leverages the information from these repeated outcome measures, and is often used for the primary analysis to…

Methodology · Statistics 2023-07-20 Bingkai Wang, Yu Du

Missing Data and Prediction

Missing data are a common problem for both the construction and implementation of a prediction algorithm. Pattern mixture kernel submodels (PMKS) - a series of submodels for every missing data pattern that are fit using only data from that…

Methodology · Statistics 2017-04-27 Sarah Fletcher Mercaldo, Jeffrey D. Blume

Progression models for repeated measures: Estimating novel treatment effects in progressive diseases

Mixed Models for Repeated Measures (MMRMs) are ubiquitous when analyzing outcomes of clinical trials. However, the linearity of the fixed-effect structure in these models largely restricts their use to estimating treatment effects that are…

Methodology · Statistics 2023-01-23 Lars Lau Raket

Comparison of Parametric versus Machine-learning Multiple Imputation in Clinical Trials with Missing Continuous Outcomes

The use of flexible machine-learning (ML) models to generate imputations of missing data within the framework of Multiple Imputation (MI) has recently gained traction, particularly in observational settings. For randomised controlled trials…

Methodology · Statistics 2025-10-07 Mia S. Tackney, Jonathan W. Bartlett, Elizabeth Williamson, Kim May Lee

Outcome-Assisted Multiple Imputation of Missing Treatments

We provide guidance on multiple imputation of missing at random treatments in observational studies. Specifically, analysts should account for both covariates and outcomes, i.e., not just use propensity scores, when imputing the missing…

Methodology · Statistics 2025-01-23 Joseph Feldman, Jerome P. Reiter

Semiparametric fractional imputation using Gaussian mixture models for handling multivariate missing data

Item nonresponse is frequently encountered in practice. Ignoring missing data can lose efficiency and lead to misleading inference. Fractional imputation is a frequentist approach of imputation for handling missing data. However, the…

Methodology · Statistics 2018-09-18 Hejian Sang, Jae Kwang Kim

Linear mixed models to handle missing at random data in trial-based economic evaluations

Trial-based cost-effectiveness analyses (CEAs) are an important source of evidence in the assessment of health interventions. In these studies, cost and effectiveness outcomes are commonly measured at multiple time points, but some…

Methodology · Statistics 2022-03-30 Andrea Gabrio, Catrin Plumpton, Sube Banerjee, Baptiste Leurent

Handling missing data in model-based clustering

Gaussian Mixture models (GMMs) are a powerful tool for clustering, classification and density estimation when clustering structures are embedded in the data. The presence of missing values can largely impact the GMMs estimation process,…

Machine Learning · Statistics 2020-06-05 Alessio Serafini, Thomas Brendan Murphy, Luca Scrucca

Recursive Equations For Imputation Of Missing Not At Random Data With Sparse Pattern Support

A common approach for handling missing values in data analysis pipelines is multiple imputation via software packages such as MICE (Van Buuren and Groothuis-Oudshoorn, 2011) and Amelia (Honaker et al., 2011). These packages typically assume…

Methodology · Statistics 2025-07-23 Trung Phung, Kyle Reese, Ilya Shpitser, Rohit Bhattacharya

Compatibility of Missing Data Handling Methods across the Stages of Producing Clinical Prediction Models

Missing data is a challenge when developing, validating and deploying clinical prediction models (CPMs). Traditionally, decisions concerning missing data handling during CPM development and validation haven't accounted for whether…

Methodology · Statistics 2026-02-04 Antonia Tsvetanova, Matthew Sperrin, David A. Jenkins, Niels Peek, Iain Buchan, Stephanie Hyland, Marcus Taylor, Angela Wood, Richard D. Riley, Glen P. Martin

A Novel Multiple Imputation Approach For Parameter Estimation in Observation-Driven Time Series Models With Missing Data

Handling missing data in time series is a complex problem due to the presence of temporal dependence. General-purpose imputation methods, while widely used, often distort key statistical properties of the data, such as variance and…

Methodology · Statistics 2026-03-18 Guilherme Pumi, Taiane Schaedler Prass, Douglas Krauthein Verdum

Multiple imputation using dimension reduction techniques for high-dimensional data

Missing data present challenges in data analysis. Naive analyses such as complete-case and available-case analysis may introduce bias and loss of efficiency, and produce unreliable results. Multiple imputation (MI) is one of the most widely…

Methodology · Statistics 2019-05-15 Domonique W. Hodge, Sandra E. Safo, Qi Long

Population-calibrated multiple imputation for a binary/categorical covariate in categorical regression models

Multiple imputation (MI) has become popular for analyses with missing data in medical research. The standard implementation of MI is based on the assumption of data being missing at random (MAR). However, for missing data generated by…

Methodology · Statistics 2019-01-03 Tra My Pham, James R Carpenter, Tim P Morris, Angela M Wood, Irene Petersen

Multiple imputation with missing data indicators

Multiple imputation is a well-established general technique for analyzing data with missing values. A convenient way to implement multiple imputation is sequential regression multiple imputation (SRMI), also called chained equations…

Methodology · Statistics 2021-03-04 Lauren J Beesley, Irina Bondarenko, Michael R Elliott, Allison W Kurian, Steven J Katz, Jeremy M G Taylor

Multiple imputation in data that grow over time: A comparison of three strategies

Multiple imputation is a highly recommended technique to deal with missing data, but the application to longitudinal datasets can be done in multiple ways. When a new wave of longitudinal data arrives, we can treat the combined data of…

Methodology · Statistics 2026-05-18 X. M. Kavelaars, S. van Buuren, J. R. van Ginkel

A monotone data augmentation algorithm for longitudinal data analysis via multivariate skew-t, skew-normal or t distributions

The mixed effects model for repeated measures (MMRM) has been widely used for the analysis of longitudinal clinical data collected at a number of fixed time points. We propose a robust extension of the MMRM for skewed and heavy-tailed data…

Methodology · Statistics 2019-08-12 Yongqiang Tang