Related papers: Multiple Imputation Methods under Extreme Values

Do we Need Dozens of Methods for Real World Missing Value Imputation?

Missing values pose a persistent challenge in modern data science. Consequently, there is an ever-growing number of publications introducing new imputation methods in various fields. While many studies compare imputation approaches, they…

Computation · Statistics 2025-11-10 Krystyna Grzesiak, Christophe Muller, Julie Josse, Jeffrey Näf

In-Database Data Imputation

Missing data is a widespread problem in many domains, creating challenges in data analysis and decision making. Traditional techniques for dealing with missing data, such as excluding incomplete records or imputing simple estimates (e.g.,…

Databases · Computer Science 2024-01-09 Massimo Perini, Milos Nikolic

Missing Value Estimation using Clustering and Deep Learning within Multiple Imputation Framework

Missing values in tabular data restrict the use and performance of machine learning, requiring the imputation of missing values. The most popular imputation algorithm is arguably multiple imputations using chains of equations (MICE), which…

Machine Learning · Computer Science 2022-03-01 Manar D Samad, Sakib Abrar, Norou Diawara

A Comparative Study of Imputation Methods for Multivariate Ordinal Data

Missing data remains a very common problem in large datasets, including survey and census data containing many ordinal responses, such as political polls and opinion surveys. Multiple imputation (MI) is usually the go-to approach for…

Methodology · Statistics 2024-12-25 Chayut Wongkamthong, Olanrewaju Akande

Are deep learning models superior for missing data imputation in large surveys? Evidence from an empirical comparison

Multiple imputation (MI) is a popular approach for dealing with missing data arising from non-response in sample surveys. Multiple imputation by chained equations (MICE) is one of the most widely used MI algorithms for multivariate data,…

Machine Learning · Computer Science 2022-03-22 Zhenhua Wang, Olanrewaju Akande, Jason Poulos, Fan Li

On the Performance of Imputation Techniques for Missing Values on Healthcare Datasets

Missing values or data is one popular characteristic of real-world datasets, especially healthcare data. This could be frustrating when using machine learning algorithms on such datasets, simply because most machine learning models perform…

Machine Learning · Computer Science 2024-03-25 Luke Oluwaseye Joel, Wesley Doorsamy, Babu Sena Paul

Recursive Equations For Imputation Of Missing Not At Random Data With Sparse Pattern Support

A common approach for handling missing values in data analysis pipelines is multiple imputation via software packages such as MICE (Van Buuren and Groothuis-Oudshoorn, 2011) and Amelia (Honaker et al., 2011). These packages typically assume…

Methodology · Statistics 2025-07-23 Trung Phung, Kyle Reese, Ilya Shpitser, Rohit Bhattacharya

A stacked approach for chained equations multiple imputation incorporating the substantive model

Multiple imputation by chained equations (MICE) has emerged as a popular approach for handling missing data. A central challenge for applying MICE is determining how to incorporate outcome information into covariate imputation models,…

Methodology · Statistics 2019-10-11 Lauren Beesley, Jeremy M G Taylor

Variational Bayesian Multiple Imputation in High-Dimensional Regression Models With Missing Responses

Multiple imputation has become one of the standard methods in drawing inferences in many incomplete data applications. Applications of multiple imputation in relatively more complex settings, such as high-dimensional clustered data, require…

Methodology · Statistics 2025-04-08 Qiushuang Li, Recai Yucel

Imputation of missing data using multivariate Gaussian Linear Cluster-Weighted Modeling

Missing data arises when certain values are not recorded or observed for variables of interest. However, most statistical theory assumes complete data availability. To address incomplete databases, one approach is to fill the gaps…

Methodology · Statistics 2023-08-15 Luis Alejandro Masmela-Caita, Thais Paiva Galletti, Marcos Oliveira Prates

Solving the "many variables" problem in MICE with principal component regression

Multiple Imputation (MI) is one of the most popular approaches to addressing missing values in questionnaires and surveys. MI with multivariate imputation by chained equations (MICE) allows flexible imputation of many types of data. In…

Methodology · Statistics 2023-04-24 Edoardo Costantini, Kyle M. Lang, Klaas Sijtsma, Tim Reeskens

A comparison of multiple imputation methods for bivariate hierarchical outcomes

Missing observations are common in cluster randomised trials. Approaches taken to handling such missing data include: complete case analysis, single-level multiple imputation that ignores the clustering, multiple imputation with a fixed…

Methodology · Statistics 2014-07-18 Karla Diaz-Ordaz, Michael G. Kenward, Manuel Gomes, Richard Grieve

Multiple imputation for multilevel data with continuous and binary variables

We present and compare multiple imputation methods for multilevel continuous and binary data where variables are systematically and sporadically missing. The methods are compared from a theoretical point of view and through an extensive…

Methodology · Statistics 2026-05-18 Vincent Audigier, Ian R. White, Shahab Jolani, Thomas P. A. Debray, Matteo Quartagno, James Carpenter, Stef van Buuren, Matthieu Resche-Rigon

Bootstrapping and Multiple Imputation Ensemble Approaches for Missing Data

Presence of missing values in a dataset can adversely affect the performance of a classifier. Single and Multiple Imputation are normally performed to fill in the missing values. In this paper, we present several variants of combining…

Machine Learning · Computer Science 2019-10-16 Shehroz S. Khan, Amir Ahmad, Alex Mihailidis

Handling missing data in model-based clustering

Gaussian Mixture models (GMMs) are a powerful tool for clustering, classification and density estimation when clustering structures are embedded in the data. The presence of missing values can largely impact the GMMs estimation process,…

Machine Learning · Statistics 2020-06-05 Alessio Serafini, Thomas Brendan Murphy, Luca Scrucca

Evaluating tree-based imputation methods as an alternative to MICE PMM for drawing inference in empirical studies

Dealing with missing data is an important problem in statistical analysis that is often addressed with imputation procedures. The performance and validity of such methods are of great importance for their application in empirical studies.…

Applications · Statistics 2024-01-19 Jakob Schwerter, Ketevan Gurtskaia, Andrés Romero, Birgit Zeyer-Gliozzo, Markus Pauly

tBayes-MICE: A Bayesian Approach to Multiple Imputation for Time Series Data

Time-series analysis is often affected by missing data, a common problem across several fields, including healthcare and environmental monitoring. Multiple Imputation by Chained Equations (MICE) has been prominent for imputing missing…

Machine Learning · Statistics 2026-04-10 Amuche Ibenegbu, Pierre Lafaye de Micheaux, Rohitash Chandra

Imputation and Missing Indicators for handling missing data in the development and implementation of clinical prediction models: a simulation study

Background: Existing guidelines for handling missing data are generally not consistent with the goals of prediction modelling, where missing data can occur at any stage of the model pipeline. Multiple imputation (MI), often heralded as the…

Methodology · Statistics 2022-06-27 Rose Sisk, Matthew Sperrin, Niels Peek, Maarten van Smeden, Glen P. Martin

Interpretable Prediction Rule Ensembles in the Presence of Missing Data

Prediction Rule Ensembles (PREs) are robust and interpretable statistical learning techniques with potential for predictive analytics, yet their efficacy in the presence of missing data is untested. This study uses multiple imputation to…

Applications · Statistics 2024-10-22 Vincent Schroeder, Jakob Schwerter, Marjolein Fokkema, Philipp Doebler

How to rank imputation methods?

Imputation is an attractive tool for dealing with the widespread issue of missing values. Consequently, studying and developing imputation methods has been an active field of research over the last decade. Faced with an imputation task and…

Methodology · Statistics 2025-07-16 Jeffrey Näf, Krystyna Grzesiak, Erwan Scornet