Related papers: Missing Data Imputation using Neural Cellular Auto…

Improving Missing Data Imputation with Deep Generative Models

Datasets with missing values are very common in industry applications, and they can have a negative impact on machine learning models. Recent studies introduced solutions to the problem of imputing missing values based on deep generative…

Machine Learning · Computer Science 2019-02-28 Ramiro D. Camino, Christian A. Hammerschmidt, Radu State

Missing Data Imputation for Supervised Learning

Missing data imputation can help improve the performance of prediction models in situations where missing data hide useful information. This paper compares methods for imputing missing categorical data for supervised classification tasks.…

Machine Learning · Statistics 2020-08-11 Jason Poulos, Rafael Valle

Imputation of missing data using multivariate Gaussian Linear Cluster-Weighted Modeling

Missing data arises when certain values are not recorded or observed for variables of interest. However, most of the statistical theory assumes complete data availability. To address incomplete databases, one approach is to fill the gaps…

Methodology · Statistics 2023-08-15 Luis Alejandro Masmela-Caita, Thais Paiva Galletti, Marcos Oliveira Prates

Imputation of Missing Data with Class Imbalance using Conditional Generative Adversarial Networks

Missing data is a common problem faced with real-world datasets. Imputation is a widely used technique to estimate the missing data. State-of-the-art imputation approaches, such as Generative Adversarial Imputation Nets (GAIN), model the…

Machine Learning · Computer Science 2020-12-02 Saqib Ejaz Awan, Mohammed Bennamoun, Ferdous Sohel, Frank M Sanfilippo, Girish Dwivedi

Estimation of Missing Data Using Computational Intelligence and Decision Trees

This paper introduces a novel paradigm to impute missing data that combines a decision tree with an auto-associative neural network (AANN) based model and a principal component analysis-neural network (PCA-NN) based model. For each model,…

Applications · Statistics 2007-09-12 George Ssali, Tshilidzi Marwala

Imputation of Missing Data Using Linear Gaussian Cluster-Weighted Modeling

Missing data theory deals with the statistical methods in the occurrence of missing data. Missing data occurs when some values are not stored or observed for variables of interest. However, most of the statistical theory assumes that data…

Methodology · Statistics 2021-10-26 Luis Alejandro Masmela-Caita, Thais Paiva Galletti, Marcos Oliveira Prates

Generative Imputation and Stochastic Prediction

In many machine learning applications, we are faced with incomplete datasets. In the literature, missing data imputation techniques have been mostly concerned with filling missing values. However, the existence of missing values is…

Machine Learning · Computer Science 2020-09-07 Mohammad Kachuee, Kimmo Karkkainen, Orpaz Goldstein, Sajad Darabi, Majid Sarrafzadeh

STING: Self-attention based Time-series Imputation Networks using GAN

Time series data are ubiquitous in real-world applications. However, one of the most common problems is that the time series data could have missing values by the inherent nature of the data collection process. So imputing missing values…

Machine Learning · Computer Science 2022-09-23 Eunkyu Oh, Taehun Kim, Yunhu Ji, Sushil Khyalia

Table Transformers for Imputing Textual Attributes

Missing data in tabular datasets is a common issue as the performance of downstream tasks usually depends on the completeness of the training dataset. Previous missing data imputation methods focus on numeric and categorical columns, but we…

Computation and Language · Computer Science 2024-11-04 Ting-Ruen Wei, Yuan Wang, Yoshitaka Inoue, Hsin-Tai Wu, Yi Fang

Missingness Augmentation: A General Approach for Improving Generative Imputation Models

Missing data imputation is a fundamental problem in data analysis, and many studies have been conducted to improve its performance by exploring model structures and learning procedures. However, data augmentation, as a simple yet effective…

Machine Learning · Computer Science 2023-04-07 Yufeng Wang, Dan Li, Cong Xu, Min Yang

MIDA: Multiple Imputation using Denoising Autoencoders

Missing data is a significant problem impacting all domains. The state-of-the-art framework for minimizing missing data bias is multiple imputation, for which the choice of an imputation model remains nontrivial. We propose a multiple…

Machine Learning · Computer Science 2018-02-20 Lovedeep Gondara, Ke Wang

Proposition of a Theoretical Model for Missing Data Imputation using Deep Learning and Evolutionary Algorithms

In the last couple of decades, there have been major advancements in the domain of missing data imputation. The techniques in the domain include, amongst others: Expectation Maximization, Neural Networks with Evolutionary Algorithms or…

Neural and Evolutionary Computing · Computer Science 2015-12-07 Collins Leke, Tshilidzi Marwala, Satyakama Paul

An Interdisciplinary and Cross-Task Review on Missing Data Imputation

Missing data is a fundamental challenge in data science, significantly hindering analysis and decision-making across a wide range of disciplines, including healthcare, bioinformatics, social science, e-commerce, and industrial monitoring.…

Machine Learning · Statistics 2026-05-12 Jicong Fan

HyperImpute: Generalized Iterative Imputation with Automatic Model Selection

Consider the problem of imputing missing values in a dataset. On the one hand, conventional approaches using iterative imputation benefit from the simplicity and customizability of learning conditional distributions directly, but suffer…

Machine Learning · Statistics 2022-06-17 Daniel Jarrett, Bogdan Cebere, Tennison Liu, Alicia Curth, Mihaela van der Schaar

Multiple Imputation with Neural Network Gaussian Process for High-dimensional Incomplete Data

Missing data are ubiquitous in real world applications and, if not adequately handled, may lead to the loss of information and biased findings in downstream analysis. Particularly, high-dimensional incomplete data with a moderate sample…

Machine Learning · Computer Science 2022-12-23 Zongyu Dai, Zhiqi Bu, Qi Long

Missing Data Multiple Imputation for Tabular Q-Learning in Online RL

Missing data in online reinforcement learning (RL) poses challenges compared to missing data in standard tabular data or in offline policy learning. The need to impute and act at each time step means that imputation cannot be put off until…

Machine Learning · Statistics 2025-10-14 Kyla Chasalow, Skyler Wu, Susan Murphy

Data Imputation by Pursuing Better Classification: A Supervised Kernel-Based Method

Data imputation, the process of filling in missing feature elements for incomplete data sets, plays a crucial role in data-driven learning. A fundamental belief is that data imputation is helpful for learning performance, and it follows…

Machine Learning · Computer Science 2025-09-30 Ruikai Yang, Fan He, Mingzhen He, Kaijie Wang, Xiaolin Huang

Autoencoder, Principal Component Analysis and Support Vector Regression for Data Imputation

Data collection often results in records that have missing values or variables. This investigation compares 3 different data imputation models and identifies their merits by using accuracy measures. Autoencoder Neural Networks, Principal…

Artificial Intelligence · Computer Science 2007-09-18 Vukosi N. Marivate, Fulufhelo V. Nelwamodo, Tshilidzi Marwala

MAIN: Multihead-Attention Imputation Networks

The problem of missing data, usually absent in curated and competition-standard datasets, is an unfortunate reality for most machine learning models used in industry applications. Recent work has focused on understanding the nature and the…

Machine Learning · Computer Science 2022-01-25 Spyridon Mouselinos, Kyriakos Polymenakos, Antonis Nikitakis, Konstantinos Kyriakopoulos

Multiple Imputation with Denoising Autoencoder using Metamorphic Truth and Imputation Feedback

Although data may be abundant, complete data is less so, due to missing columns or rows. This missingness undermines the performance of downstream data products that either omit incomplete cases or create derived completed data for…

Machine Learning · Computer Science 2020-06-26 Haw-minn Lu, Giancarlo Perrone, José Unpingco