Related papers: Improving Missing Data Imputation with Deep Genera…

Probabilistic Imputation for Time-series Classification with Missing Data

Multivariate time series data for real-world applications typically contain a significant amount of missing values. The dominant approach for classification with such missing values is to impute them heuristically with specific values…

Machine Learning · Computer Science 2023-08-15 SeungHyun Kim, Hyunsu Kim, EungGu Yun, Hwangrae Lee, Jaehun Lee, Juho Lee

Missing Value Imputation Based on Deep Generative Models

Missing values are widespread in real-world datasets, which hinders advanced data analytics. Properly filling these missing values is crucial but challenging, especially when the missing rate is high. Many approaches…

Machine Learning · Computer Science 2018-08-07 Hongbao Zhang, Pengtao Xie, Eric Xing

Generative Imputation and Stochastic Prediction

In many machine learning applications, we are faced with incomplete datasets. In the literature, missing data imputation techniques have been mostly concerned with filling missing values. However, the existence of missing values is…

Machine Learning · Computer Science 2020-09-07 Mohammad Kachuee, Kimmo Karkkainen, Orpaz Goldstein, Sajad Darabi, Majid Sarrafzadeh

Identifiable Generative Models for Missing Not at Random Data Imputation

Real-world datasets often have missing values associated with complex generative processes, where the cause of the missingness may not be fully observed. This is known as missing not at random (MNAR) data. However, many imputation methods…

Machine Learning · Computer Science 2021-10-29 Chao Ma, Cheng Zhang

Missing Value Estimation Algorithms on Cluster and Representativeness Preservation of Gene Expression Microarray Data

Missing values are largely inevitable in gene expression microarray studies. Data sets often have significant omissions due to individuals dropping out of experiments, errors in data collection, image corruptions, and so on. Missing data…

Quantitative Methods · Quantitative Biology 2018-09-18 Marie Li

Transformed Distribution Matching for Missing Value Imputation

We study the problem of imputing missing values in a dataset, which has important applications in many domains. The key to missing value imputation is to capture the data distribution with incomplete samples and impute the missing values…

Machine Learning · Computer Science 2023-06-26 He Zhao, Ke Sun, Amir Dezfouli, Edwin Bonilla

Classification of datasets with imputed missing values: does imputation quality matter?

Classifying samples in incomplete datasets is a common aim for machine learning practitioners, but is non-trivial. Missing data is found in most real-world datasets and these missing values are typically imputed using established methods,…

Machine Learning · Computer Science 2023-12-20 Tolou Shadbahr, Michael Roberts, Jan Stanczuk, Julian Gilbey, Philip Teare, Sören Dittmer, Matthew Thorpe, Ramon Vinas Torne, Evis Sala, Pietro Lio, Mishal Patel, AIX-COVNET Collaboration, James H. F. Rudd, Tuomas Mirtti, Antti Rannikko, John A. D. Aston, Jing Tang, Carola-Bibiane Schönlieb

Proposition of a Theoretical Model for Missing Data Imputation using Deep Learning and Evolutionary Algorithms

In the last couple of decades, there have been major advancements in the domain of missing data imputation. The techniques in the domain include, amongst others: Expectation Maximization, Neural Networks with Evolutionary Algorithms or…

Neural and Evolutionary Computing · Computer Science 2015-12-07 Collins Leke, Tshilidzi Marwala, Satyakama Paul

No imputation without representation

By filling in missing values in datasets, imputation allows these datasets to be used with algorithms that cannot handle missing values by themselves. However, missing values may in principle contribute useful information that is lost…

Machine Learning · Computer Science 2024-10-31 Oliver Urs Lenz, Daniel Peralta, Chris Cornelis

HyperImpute: Generalized Iterative Imputation with Automatic Model Selection

Consider the problem of imputing missing values in a dataset. On the one hand, conventional approaches using iterative imputation benefit from the simplicity and customizability of learning conditional distributions directly, but suffer…

Machine Learning · Statistics 2022-06-17 Daniel Jarrett, Bogdan Cebere, Tennison Liu, Alicia Curth, Mihaela van der Schaar

Deep Imputation of Missing Values in Time Series Health Data: A Review with Benchmarking

The imputation of missing values in multivariate time series (MTS) data is critical in ensuring data quality and producing reliable data-driven predictive models. Apart from many statistical approaches, a few recent studies have proposed…

Machine Learning · Computer Science 2023-05-17 Maksims Kazijevs, Manar D. Samad

Missing Data Imputation for Supervised Learning

Missing data imputation can help improve the performance of prediction models in situations where missing data hide useful information. This paper compares methods for imputing missing categorical data for supervised classification tasks.…

Machine Learning · Statistics 2020-08-11 Jason Poulos, Rafael Valle

Evolving imputation strategies for missing data in classification problems with TPOT

Missing data has a ubiquitous presence in real-life applications of machine learning techniques. Imputation methods are algorithms conceived for restoring missing values in the data, based on other entries in the database. The choice of the…

Machine Learning · Computer Science 2017-08-16 Unai Garciarena, Roberto Santana, Alexander Mendiburu

DiffPuter: Empowering Diffusion Models for Missing Data Imputation

Generative models play an important role in missing data imputation in that they aim to learn the joint distribution of full data. However, applying advanced deep generative models (such as Diffusion models) to missing data imputation is…

Machine Learning · Computer Science 2025-05-27 Hengrui Zhang, Liancheng Fang, Qitian Wu, Philip S. Yu

Chains of Autoreplicative Random Forests for missing value imputation in high-dimensional datasets

Missing values are a common problem in data science and machine learning. Removing instances with missing values can adversely affect the quality of further data analysis. This is exacerbated when there are relatively many more features…

Machine Learning · Computer Science 2023-01-03 Ekaterina Antonenko, Jesse Read

Diffusion Transformers for Imputation: Statistical Efficiency and Uncertainty Quantification

Imputation methods play a critical role in enhancing the quality of practical time-series data, which often suffer from pervasive missing values. Recently, diffusion-based generative imputation methods have demonstrated remarkable success…

Machine Learning · Computer Science 2025-10-03 Zeqi Ye, Minshuo Chen

Deep Generative Imputation Model for Missing Not At Random Data

Data analysis usually suffers from the Missing Not At Random (MNAR) problem, where the cause of the value missing is not fully observed. Compared to the naive Missing Completely At Random (MCAR) problem, it is more in line with the…

Machine Learning · Computer Science 2025-05-27 Jialei Chen, Yuanbo Xu, Pengyang Wang, Yongjian Yang

Certain and Approximately Certain Models for Statistical Learning

Real-world data is often incomplete and contains missing values. To train accurate models over real-world datasets, users need to spend a substantial amount of time and resources imputing and finding proper values for missing data items. In…

Machine Learning · Statistics 2024-03-05 Cheng Zhen, Nischal Aryal, Arash Termehchy, Alireza Aghasi, Amandeep Singh Chabada

Diffusion models for missing value imputation in tabular data

Missing value imputation in machine learning is the task of estimating the missing values in the dataset accurately using available information. In this task, several deep generative modeling methods have been proposed and demonstrated…

Machine Learning · Computer Science 2023-03-14 Shuhan Zheng, Nontawat Charoenphakdee

Missing Data Imputation using Neural Cellular Automata

When working with tabular data, missingness is always one of the most painful problems. Over many years, researchers have continuously explored better ways to impute missing data. Recently, with the rapid development…

Machine Learning · Computer Science 2025-09-09 Tin Luu, Binh Nguyen, Man Ngo