Related papers: Multiple imputation with missing data indicators

Random Indicator Imputation for Missing Not At Random Data

Imputation methods for dealing with incomplete data typically assume that the missingness mechanism is at random (MAR). These methods can also be applied to missing not at random (MNAR) situations, where the user specifies some adjustment…

Methodology · Statistics 2024-04-24 Shahab Jolani, Stef van Buuren

Multiple imputation of incomplete multilevel data using Heckman selection models

Missing data is a common problem in medical research, and is commonly addressed using multiple imputation. Although traditional imputation methods allow for valid statistical inference when data are missing at random (MAR), their…

Methodology · Statistics 2023-01-13 Johanna Muñoz, Matthias Egger, Orestis Efthimiou, Vincent Audigier, Valentijn M. T. de Jong, Thomas P. A. Debray

Recursive Equations For Imputation Of Missing Not At Random Data With Sparse Pattern Support

A common approach for handling missing values in data analysis pipelines is multiple imputation via software packages such as MICE (Van Buuren and Groothuis-Oudshoorn, 2011) and Amelia (Honaker et al., 2011). These packages typically assume…

Methodology · Statistics 2025-07-23 Trung Phung, Kyle Reese, Ilya Shpitser, Rohit Bhattacharya

Multiple Imputation for Non-Monotone Missing Not at Random Binary Data using the No Self-Censoring Model

Although approaches for handling missing data from longitudinal studies are well-developed when the patterns of missingness are monotone, fewer methods are available for non-monotone missingness. Moreover, the conventional missing at random…

Methodology · Statistics 2023-02-28 Boyu Ren, Stuart R. Lipsitz, Roger D. Weiss, Garrett M. Fitzmaurice

Imputation and Missing Indicators for handling missing data in the development and implementation of clinical prediction models: a simulation study

Background: Existing guidelines for handling missing data are generally not consistent with the goals of prediction modelling, where missing data can occur at any stage of the model pipeline. Multiple imputation (MI), often heralded as the…

Methodology · Statistics 2022-06-27 Rose Sisk, Matthew Sperrin, Niels Peek, Maarten van Smeden, Glen P. Martin

Multiple imputation using dimension reduction techniques for high-dimensional data

Missing data present challenges in data analysis. Naive analyses such as complete-case and available-case analysis may introduce bias and loss of efficiency, and produce unreliable results. Multiple imputation (MI) is one of the most widely…

Methodology · Statistics 2019-05-15 Domonique W. Hodge, Sandra E. Safo, Qi Long

Maximum likelihood multiple imputation: Faster imputations and consistent standard errors without posterior draws

Multiple imputation (MI) is a method for repairing and analyzing data with missing values. MI replaces missing values with a sample of random values drawn from an imputation model. The most popular form of MI, which we call posterior draw…

Methodology · Statistics 2019-11-18 Paul T. von Hippel, Jonathan Bartlett

Population-calibrated multiple imputation for a binary/categorical covariate in categorical regression models

Multiple imputation (MI) has become popular for analyses with missing data in medical research. The standard implementation of MI is based on the assumption of data being missing at random (MAR). However, for missing data generated by…

Methodology · Statistics 2019-01-03 Tra My Pham, James R Carpenter, Tim P Morris, Angela M Wood, Irene Petersen

Optimized Linear Imputation

Often in real-world datasets, especially in high dimensional data, some feature values are missing. Since most data analysis and statistical methods do not handle gracefully missing values, the first step in the analysis requires the…

Machine Learning · Statistics 2016-12-08 Yehezkel S. Resheff, Daphna Weinshall

How to apply multiple imputation in propensity score matching with partially observed confounders: a simulation study and practical recommendations

Propensity score matching (PSM) has been widely used to mitigate confounding in observational studies, although complications arise when the covariates used to estimate the PS are only partially observed. Multiple imputation (MI) is a…

Applications · Statistics 2021-07-22 Albee Y. Ling, Maria E. Montez-Rath, Maya B. Mathur, Kris Kapphahn, Manisha Desai

Generative Conditional Missing Imputation Networks

In this study, we introduce a sophisticated generative conditional strategy designed to impute missing values within datasets, an area of considerable importance in statistical analysis. Specifically, we initially elucidate the theoretical…

Machine Learning · Statistics 2026-01-05 George Sun, Yi-Hui Zhou

Multiple Imputation: A Review of Practical and Theoretical Findings

Multiple imputation is a straightforward method for handling missing data in a principled fashion. This paper presents an overview of multiple imputation, including important theoretical results and their practical implications for…

Methodology · Statistics 2018-01-15 Jared S. Murray

Semi-Supervised Learning with Multiple Imputations on Non-Random Missing Labels

Semi-Supervised Learning (SSL) is implemented when algorithms are trained on both labeled and unlabeled data. This is a very common application of ML as it is unrealistic to obtain a fully labeled dataset. Researchers have tackled three…

Machine Learning · Computer Science 2023-08-16 Jason Lu, Michael Ma, Huaze Xu, Zixi Xu

General and Feasible Tests with Multiply-Imputed Datasets

Multiple imputation (MI) is a technique especially designed for handling missing data in public-use datasets. It allows analysts to perform incomplete-data inference straightforwardly by using several already imputed datasets released by…

Methodology · Statistics 2022-01-03 Kin Wai Chan

Multiple Imputation with Neural Network Gaussian Process for High-dimensional Incomplete Data

Missing data are ubiquitous in real world applications and, if not adequately handled, may lead to the loss of information and biased findings in downstream analysis. Particularly, high-dimensional incomplete data with a moderate sample…

Machine Learning · Computer Science 2022-12-23 Zongyu Dai, Zhiqi Bu, Qi Long

Multiple imputation of missing covariate values in multilevel models with random slopes: A cautionary note

Multiple imputation (MI) has become one of the main procedures used to treat missing data, but the guidelines from the methodological literature are not easily transferred to multilevel research. For models including random slopes, proper…

Methodology · Statistics 2016-06-30 Simon Grund, Oliver Lüdtke, Alexander Robitzsch

Identifiable Deep Latent Variable Models for MNAR Data

Missing data is a ubiquitous challenge in data analysis, often leading to biased and inaccurate results. Traditional imputation methods usually assume that the missingness mechanism is missing-at-random (MAR), where the missingness is…

Methodology · Statistics 2026-03-30 Huiming Xie, Fei Xue, Xiao Wang

Deep Generative Imputation Model for Missing Not At Random Data

Data analysis usually suffers from the Missing Not At Random (MNAR) problem, where the cause of the value missing is not fully observed. Compared to the naive Missing Completely At Random (MCAR) problem, it is more in line with the…

Machine Learning · Computer Science 2025-05-27 Jialei Chen, Yuanbo Xu, Pengyang Wang, Yongjian Yang

Sufficient Identification Conditions and Semiparametric Estimation under Missing Not at Random Mechanisms

Conducting valid statistical analyses is challenging in the presence of missing-not-at-random (MNAR) data, where the missingness mechanism is dependent on the missing values themselves even conditioned on the observed data. Here, we…

Methodology · Statistics 2023-06-13 Anna Guo, Jiwei Zhao, Razieh Nabi

Estimation and imputation in Probabilistic Principal Component Analysis with Missing Not At Random data

Missing Not At Random (MNAR) values lead to significant biases in the data, since the probability of missingness depends on the unobserved values. They are "not ignorable" in the sense that they often require defining a model for the…

Statistics Theory · Mathematics 2020-06-11 Aude Sportisse, Claire Boyer, Julie Josse