Related papers: Machine Learning Based Missing Values Imputation i…

Missing Data Imputation for Supervised Learning

Missing data imputation can help improve the performance of prediction models in situations where missing data hide useful information. This paper compares methods for imputing missing categorical data for supervised classification tasks.…

Machine Learning · Statistics 2020-08-11 Jason Poulos , Rafael Valle

Missing Values and Imputation in Healthcare Data: Can Interpretable Machine Learning Help?

Missing values are a fundamental problem in data science. Many datasets have missing values that must be properly handled because the way missing values are treated can have large impact on the resulting machine learning model. In medical…

Machine Learning · Computer Science 2023-04-25 Zhi Chen , Sarah Tan , Urszula Chajewska , Cynthia Rudin , Rich Caruana

Algorithm for Missing Values Imputation in Categorical Data with Use of Association Rules

This paper presents algorithm for missing values imputation in categorical data. The algorithm is based on using association rules and is presented in three variants. Experimental shows better accuracy of missing values imputation using the…

Machine Learning · Computer Science 2012-11-09 Jiří Kaiser

Multilevel Weighted Support Vector Machine for Classification on Healthcare Data with Missing Values

This work is motivated by the needs of predictive analytics on healthcare data as represented by Electronic Medical Records. Such data is invariably problematic: noisy, with missing entries, with imbalance in classes of interests, leading…

Machine Learning · Statistics 2016-09-28 Talayeh Razzaghi , Oleg Roderick , Ilya Safro , Nicholas Marko

Missing Value Estimation using Clustering and Deep Learning within Multiple Imputation Framework

Missing values in tabular data restrict the use and performance of machine learning, requiring the imputation of missing values. The most popular imputation algorithm is arguably multiple imputations using chains of equations (MICE), which…

Machine Learning · Computer Science 2022-03-01 Manar D Samad , Sakib Abrar , Norou Diawara

Fast Imbalanced Classification of Healthcare Data with Missing Values

In medical domain, data features often contain missing values. This can create serious bias in the predictive modeling. Typical standard data mining methods often produce poor performance measures. In this paper, we propose a new method to…

Machine Learning · Statistics 2015-03-24 Talayeh Razzaghi , Oleg Roderick , Ilya Safro , Nick Marko

ELMV: an Ensemble-Learning Approach for Analyzing Electrical Health Records with Significant Missing Values

Many real-world Electronic Health Record (EHR) data contains a large proportion of missing values. Leaving substantial portion of missing information unaddressed usually causes significant bias, which leads to invalid conclusion to be…

Machine Learning · Computer Science 2020-11-04 Lucas J. Liu , Hongwei Zhang , Jianzhong Di , Jin Chen

Meta-Imputation Balanced (MIB): An Ensemble Approach for Handling Missing Data in Biomedical Machine Learning

Missing data represents a fundamental challenge in machine learning applications, often reducing model performance and reliability. This problem is particularly acute in fields like bioinformatics and clinical machine learning, where…

Machine Learning · Computer Science 2025-09-04 Fatemeh Azad , Zoran Bosnić , Matjaž Kukar

An Interdisciplinary and Cross-Task Review on Missing Data Imputation

Missing data is a fundamental challenge in data science, significantly hindering analysis and decision-making across a wide range of disciplines, including healthcare, bioinformatics, social science, e-commerce, and industrial monitoring.…

Machine Learning · Statistics 2026-05-12 Jicong Fan

Improving Missing Data Imputation with Deep Generative Models

Datasets with missing values are very common on industry applications, and they can have a negative impact on machine learning models. Recent studies introduced solutions to the problem of imputing missing values based on deep generative…

Machine Learning · Computer Science 2019-02-28 Ramiro D. Camino , Christian A. Hammerschmidt , Radu State

Machine learning with incomplete datasets using multi-objective optimization models

Machine learning techniques have been developed to learn from complete data. When missing values exist in a dataset, the incomplete data should be preprocessed separately by removing data points with missing values or imputation. In this…

Machine Learning · Computer Science 2020-12-25 Hadi A. Khorshidi , Michael Kirley , Uwe Aickelin

Denoising ESG: quantifying data uncertainty from missing data with Machine Learning and prediction intervals

Environmental, Social, and Governance (ESG) datasets are frequently plagued by significant data gaps, leading to inconsistencies in ESG ratings due to varying imputation methods. This paper explores the application of established machine…

Machine Learning · Computer Science 2024-07-30 Sergio Caprioli , Jacopo Foschi , Riccardo Crupi , Alessandro Sabatino

Imputation techniques on missing values in breast cancer treatment and fertility data

Clinical decision support using data mining techniques offers more intelligent way to reduce the decision error in the last few years. However, clinical datasets often suffer from high missingness, which adversely impacts the quality of…

Machine Learning · Computer Science 2020-11-20 Xuetong Wu , Hadi Akbarzadeh Khorshidi , Uwe Aickelin , Zobaida Edib , Michelle Peate

Classification of datasets with imputed missing values: does imputation quality matter?

Classifying samples in incomplete datasets is a common aim for machine learning practitioners, but is non-trivial. Missing data is found in most real-world datasets and these missing values are typically imputed using established methods,…

Machine Learning · Computer Science 2023-12-20 Tolou Shadbahr , Michael Roberts , Jan Stanczuk , Julian Gilbey , Philip Teare , Sören Dittmer , Matthew Thorpe , Ramon Vinas Torne , Evis Sala , Pietro Lio , Mishal Patel , AIX-COVNET Collaboration , James H. F. Rudd , Tuomas Mirtti , Antti Rannikko , John A. D. Aston , Jing Tang , Carola-Bibiane Schönlieb

Missing Value Imputation Based on Deep Generative Models

Missing values widely exist in many real-world datasets, which hinders the performing of advanced data analytics. Properly filling these missing values is crucial but challenging, especially when the missing rate is high. Many approaches…

Machine Learning · Computer Science 2018-08-07 Hongbao Zhang , Pengtao Xie , Eric Xing

Handling missing data in a neural network approach for the identification of charged particles in a multilayer detector

Identification of charged particles in a multilayer detector by the energy loss technique may also be achieved by the use of a neural network. The performance of the network becomes worse when a large fraction of information is missing, for…

Methodology · Statistics 2020-04-14 S. Riggi , D. Riggi , F. Riggi

Performance comparison of State-of-the-art Missing Value Imputation Algorithms on Some Bench mark Datasets

Decision making from data involves identifying a set of attributes that contribute to effective decision making through computational intelligence. The presence of missing values greatly influences the selection of right set of attributes…

Machine Learning · Computer Science 2013-07-23 M. Naresh Kumar

An ensemble learning method for variable selection: application to high dimensional data and missing values

Standard approaches for variable selection in linear models are not tailored to deal properly with high-dimensional and incomplete data. Currently, methods dedicated to high-dimensional data handle missing values by ad-hoc strategies, like…

Methodology · Statistics 2021-06-09 Avner Bar-Hen , Vincent Audigier

Missing Data: A Comparison of Neural Network and Expectation Maximisation Techniques

The estimation of missing input vector elements in real time processing applications requires a system that possesses the knowledge of certain characteristics such as correlations between variables, which are inherent in the input space.…

Applications · Statistics 2007-05-23 Fulufhelo V. Nelwamondo , Shakir Mohamed , Tshilidzi Marwala

Missing Values Handling for Machine Learning Portfolios

We characterize the structure and origins of missingness for 159 cross-sectional return predictors and study missing value handling for portfolios constructed using machine learning. Simply imputing with cross-sectional means performs well…

Methodology · Statistics 2024-01-15 Andrew Y. Chen , Jack McCoy