Related papers: Missing Data Imputation for Supervised Learning

Data Imputation by Pursuing Better Classification: A Supervised Kernel-Based Method

Data imputation, the process of filling in missing feature elements for incomplete data sets, plays a crucial role in data-driven learning. A fundamental belief is that data imputation is helpful for learning performance, and it follows…

Machine Learning · Computer Science 2025-09-30 Ruikai Yang , Fan He , Mingzhen He , Kaijie Wang , Xiaolin Huang

No imputation without representation

By filling in missing values in datasets, imputation allows these datasets to be used with algorithms that cannot handle missing values by themselves. However, missing values may in principle contribute useful information that is lost…

Machine Learning · Computer Science 2024-10-31 Oliver Urs Lenz , Daniel Peralta , Chris Cornelis

Imputations for High Missing Rate Data in Covariates via Semi-supervised Learning Approach

Advancements in data collection techniques and the heterogeneity of data resources can yield high percentages of missing observations on variables, such as block-wise missing data. Under missing-data scenarios, traditional methods such as…

Methodology · Statistics 2022-05-17 Wei Lan , Xuerong Chen , Tao Zou , Chih-Ling Tsai

Generative Imputation and Stochastic Prediction

In many machine learning applications, we are faced with incomplete datasets. In the literature, missing data imputation techniques have been mostly concerned with filling missing values. However, the existence of missing values is…

Machine Learning · Computer Science 2020-09-07 Mohammad Kachuee , Kimmo Karkkainen , Orpaz Goldstein , Sajad Darabi , Majid Sarrafzadeh

Missing Data Imputation for Classification Problems

Imputation of missing data is a common application in various classification problems where the feature training matrix has missingness. A widely used solution to this imputation problem is based on the lazy learning technique, $k$-nearest…

Machine Learning · Statistics 2020-02-26 Arkopal Choudhury , Michael R. Kosorok

Filling out the missing gaps: Time Series Imputation with Semi-Supervised Learning

Missing data in time series is a challenging issue affecting time series analysis. Missing data occurs due to problems like data drops or sensor malfunctioning. Imputation methods are used to fill in these values, with quality of imputation…

Machine Learning · Computer Science 2023-04-11 Karan Aggarwal , Jaideep Srivastava

Classification of datasets with imputed missing values: does imputation quality matter?

Classifying samples in incomplete datasets is a common aim for machine learning practitioners, but is non-trivial. Missing data is found in most real-world datasets and these missing values are typically imputed using established methods,…

Machine Learning · Computer Science 2023-12-20 Tolou Shadbahr , Michael Roberts , Jan Stanczuk , Julian Gilbey , Philip Teare , Sören Dittmer , Matthew Thorpe , Ramon Vinas Torne , Evis Sala , Pietro Lio , Mishal Patel , AIX-COVNET Collaboration , James H. F. Rudd , Tuomas Mirtti , Antti Rannikko , John A. D. Aston , Jing Tang , Carola-Bibiane Schönlieb

On the consistency of supervised learning with missing values

In many application settings, the data have missing entries which make analysis challenging. An abundant literature addresses missing values in an inferential framework: estimating parameters and their variance from incomplete tables. Here,…

Machine Learning · Statistics 2024-03-22 Julie Josse , Jacob M. Chen , Nicolas Prost , Erwan Scornet , Gaël Varoquaux

On the Relation between Prediction and Imputation Accuracy under Missing Covariates

Missing covariates in regression or classification problems can prohibit the direct use of advanced tools for further analysis. Recent research has realized an increasing trend towards the usage of modern Machine Learning algorithms for…

Machine Learning · Statistics 2022-03-23 Burim Ramosaj , Justus Tulowietzki , Markus Pauly

Improving Missing Data Imputation with Deep Generative Models

Datasets with missing values are very common on industry applications, and they can have a negative impact on machine learning models. Recent studies introduced solutions to the problem of imputing missing values based on deep generative…

Machine Learning · Computer Science 2019-02-28 Ramiro D. Camino , Christian A. Hammerschmidt , Radu State

Navigating Data Corruption in Machine Learning: Balancing Quality, Quantity, and Imputation Strategies

Data corruption, including missing and noisy data, poses significant challenges in real-world machine learning. This study investigates the effects of data corruption on model performance and explores strategies to mitigate these effects…

Machine Learning · Computer Science 2025-05-22 Qi Liu , Wanjing Ma

Machine Learning Based Missing Values Imputation in Categorical Datasets

In order to predict and fill in the gaps in categorical datasets, this research looked into the use of machine learning algorithms. The emphasis was on ensemble models constructed using the Error Correction Output Codes framework, including…

Machine Learning · Computer Science 2024-09-13 Muhammad Ishaq , Sana Zahir , Laila Iftikhar , Mohammad Farhad Bulbul , Seungmin Rho , Mi Young Lee

An Interdisciplinary and Cross-Task Review on Missing Data Imputation

Missing data is a fundamental challenge in data science, significantly hindering analysis and decision-making across a wide range of disciplines, including healthcare, bioinformatics, social science, e-commerce, and industrial monitoring.…

Machine Learning · Statistics 2026-05-12 Jicong Fan

Label-Guided Imputation via Forest-Based Proximities for Improved Time Series Classification

Missing data is a common problem in time series data. Most methods for imputation ignore label information pertaining to the time series even if that information exists. In this paper, we provide a framework for missing data imputation in…

Machine Learning · Statistics 2025-09-30 Jake S. Rhodes , Adam G. Rustad , Sofia Pelagalli Maia , Evan Thacker , Hyunmi Choi , Jose Gutierrez , Tatjana Rundek , Ben Shaw

Meta-Imputation Balanced (MIB): An Ensemble Approach for Handling Missing Data in Biomedical Machine Learning

Missing data represents a fundamental challenge in machine learning applications, often reducing model performance and reliability. This problem is particularly acute in fields like bioinformatics and clinical machine learning, where…

Machine Learning · Computer Science 2025-09-04 Fatemeh Azad , Zoran Bosnić , Matjaž Kukar

Probabilistic Imputation for Time-series Classification with Missing Data

Multivariate time series data for real-world applications typically contain a significant amount of missing values. The dominant approach for classification with such missing values is to impute them heuristically with specific values…

Machine Learning · Computer Science 2023-08-15 SeungHyun Kim , Hyunsu Kim , EungGu Yun , Hwangrae Lee , Jaehun Lee , Juho Lee

Imputation for prediction: beware of diminishing returns

Missing values are prevalent across various fields, posing challenges for training and deploying predictive models. In this context, imputation is a common practice, driven by the hope that accurate imputations will enhance predictions.…

Artificial Intelligence · Computer Science 2025-02-21 Marine Le Morvan , Gaël Varoquaux

Imputation and Missing Indicators for handling missing data in the development and implementation of clinical prediction models: a simulation study

Background: Existing guidelines for handling missing data are generally not consistent with the goals of prediction modelling, where missing data can occur at any stage of the model pipeline. Multiple imputation (MI), often heralded as the…

Methodology · Statistics 2022-06-27 Rose Sisk , Matthew Sperrin , Niels Peek , Maarten van Smeden , Glen P. Martin

Performance comparison of State-of-the-art Missing Value Imputation Algorithms on Some Bench mark Datasets

Decision making from data involves identifying a set of attributes that contribute to effective decision making through computational intelligence. The presence of missing values greatly influences the selection of right set of attributes…

Machine Learning · Computer Science 2013-07-23 M. Naresh Kumar

Does imputation matter? Benchmark for predictive models

Incomplete data are common in practical applications. Most predictive machine learning models do not handle missing values so they require some preprocessing. Although many algorithms are used for data imputation, we do not understand the…

Machine Learning · Statistics 2020-07-07 Katarzyna Woźnica , Przemysław Biecek