Related papers: Imputing missing values with unsupervised random t…

MissForest - nonparametric missing value imputation for mixed-type data

Modern data acquisition based on high-throughput technology is often facing the problem of missing data. Algorithms commonly used in the analysis of such large-scale data often depend on a complete set. Missing value imputation offers a…

Applications · Statistics 2014-06-03 Daniel J. Stekhoven, Peter Bühlmann

Benchmarking missing-values approaches for predictive models on health databases

BACKGROUND: As databases grow larger, it becomes harder to fully control their collection, and they frequently come with missing values: incomplete observations. These large databases are well suited to train machine-learning models, for…

Machine Learning · Computer Science 2022-02-23 Alexandre Perez-Lebel, Gaël Varoquaux, Marine Le Morvan, Julie Josse, Jean-Baptiste Poline

A Structured Prediction Approach for Missing Value Imputation

Missing value imputation is an important practical problem. There is a large body of work on it, but there does not exist any work that formulates the problem in a structured output setting. Also, most applications have constraints on the…

Machine Learning · Computer Science 2013-11-12 Rahul Kidambi, Vinod Nair, Sundararajan Sellamanickam, S. Sathiya Keerthi

Choosing Imputation Models

Imputing missing values is an important preprocessing step in data analysis, but the literature offers little guidance on how to choose between different imputation models. This letter suggests adopting the imputation model that generates a…

Methodology · Statistics 2021-07-13 Moritz Marbach

Multiple imputation in data that grow over time: A comparison of three strategies

Multiple imputation is a highly recommended technique to deal with missing data, but the application to longitudinal datasets can be done in multiple ways. When a new wave of longitudinal data arrives, we can treat the combined data of…

Methodology · Statistics 2026-05-18 X. M. Kavelaars, S. van Buuren, J. R. van Ginkel

Imputations for High Missing Rate Data in Covariates via Semi-supervised Learning Approach

Advancements in data collection techniques and the heterogeneity of data resources can yield high percentages of missing observations on variables, such as block-wise missing data. Under missing-data scenarios, traditional methods such as…

Methodology · Statistics 2022-05-17 Wei Lan, Xuerong Chen, Tao Zou, Chih-Ling Tsai

Imputation of missing data using multivariate Gaussian Linear Cluster-Weighted Modeling

Missing data arises when certain values are not recorded or observed for variables of interest. However, most of the statistical theory assumes complete data availability. To address incomplete databases, one approach is to fill the gaps…

Methodology · Statistics 2023-08-15 Luis Alejandro Masmela-Caita, Thais Paiva Galletti, Marcos Oliveira Prates

Influence of parallel computing strategies of iterative imputation of missing data: a case study on missForest

Machine learning iterative imputation methods have been well accepted by researchers for imputing missing data, but they can be time-consuming when handling large datasets. To overcome this drawback, parallel computing strategies have been…

Applications · Statistics 2020-04-24 Shangzhi Hong, Yuqi Sun, Hanying Li, Henry S. Lynn

On the consistency of supervised learning with missing values

In many application settings, the data have missing entries which make analysis challenging. An abundant literature addresses missing values in an inferential framework: estimating parameters and their variance from incomplete tables. Here,…

Machine Learning · Statistics 2024-03-22 Julie Josse, Jacob M. Chen, Nicolas Prost, Erwan Scornet, Gaël Varoquaux

Imputation of Missing Data Using Linear Gaussian Cluster-Weighted Modeling

Missing data theory deals with the statistical methods in the occurrence of missing data. Missing data occurs when some values are not stored or observed for variables of interest. However, most of the statistical theory assumes that data…

Methodology · Statistics 2021-10-26 Luis Alejandro Masmela-Caita, Thais Paiva Galletti, Marcos Oliveira Prates

Regression-based imputation of explanatory discrete missing data

Imputation of missing values is a strategy for handling non-responses in surveys or data loss in measurement processes, which may be more effective than ignoring them. When the variable represents a count, the literature dealing with this…

Applications · Statistics 2020-07-31 Gilma Hernández-Herrera, Albert Navarro, David Moriña

No imputation without representation

By filling in missing values in datasets, imputation allows these datasets to be used with algorithms that cannot handle missing values by themselves. However, missing values may in principle contribute useful information that is lost…

Machine Learning · Computer Science 2024-10-31 Oliver Urs Lenz, Daniel Peralta, Chris Cornelis

Imputing With Predictive Mean Matching Can Be Severely Biased When Values Are Missing At Random

Predictive mean matching (PMM) is a popular imputation strategy that imputes missing values by borrowing observed values from other cases with similar expectations. We show that, unlike other imputation strategies, PMM is not guaranteed to…

Methodology · Statistics 2025-07-01 Paul T. von Hippel

How to rank imputation methods?

Imputation is an attractive tool for dealing with the widespread issue of missing values. Consequently, studying and developing imputation methods has been an active field of research over the last decade. Faced with an imputation task and…

Methodology · Statistics 2025-07-16 Jeffrey Näf, Krystyna Grzesiak, Erwan Scornet

HyperImpute: Generalized Iterative Imputation with Automatic Model Selection

Consider the problem of imputing missing values in a dataset. On the one hand, conventional approaches using iterative imputation benefit from the simplicity and customizability of learning conditional distributions directly, but suffer…

Machine Learning · Statistics 2022-06-17 Daniel Jarrett, Bogdan Cebere, Tennison Liu, Alicia Curth, Mihaela van der Schaar

HMVI: Unifying Heterogeneous Attributes with Natural Neighbors for Missing Value Inference

Missing value imputation is a fundamental challenge in machine intelligence, heavily dependent on data completeness. Current imputation methods often handle numerical and categorical attributes independently, overlooking critical…

Machine Learning · Computer Science 2026-01-09 Xiaopeng Luo, Zexi Tan, Zhuowei Wang

Imputation procedures in surveys using nonparametric and machine learning methods: an empirical comparison

Nonparametric and machine learning methods are flexible methods for obtaining accurate predictions. Nowadays, data sets with a large number of predictors and complex structures are fairly common. In the presence of item nonresponse,…

Methodology · Statistics 2022-08-23 Mehdi Dagdoug, Camelia Goga, David Haziza

Imputation Scores

Given the prevalence of missing data in modern statistical research, a broad range of methods is available for any given imputation task. How does one choose the 'best' imputation method in a given application? The standard approach is to…

Applications · Statistics 2022-12-01 Jeffrey Näf, Meta-Lina Spohn, Loris Michel, Nicolai Meinshausen

Missing Value Estimation Algorithms on Cluster and Representativeness Preservation of Gene Expression Microarray Data

Missing values are largely inevitable in gene expression microarray studies. Data sets often have significant omissions due to individuals dropping out of experiments, errors in data collection, image corruptions, and so on. Missing data…

Quantitative Methods · Quantitative Biology 2018-09-18 Marie Li

A Robust Missing Value Imputation Method MifImpute For Incomplete Molecular Descriptor Data And Comparative Analysis With Other Missing Value Imputation Methods

Missing data imputation is an important research topic in data mining. Large-scale molecular descriptor data may contain missing values (MVs). However, some methods for downstream analyses, including some prediction tools, require a…

Computational Engineering, Finance, and Science · Computer Science 2013-12-13 Doreswamy, Chanabasayya . M. Vastrad