Related papers: Missing Data Imputation for Classification Problem…

200 papers

Missing data imputation can help improve the performance of prediction models in situations where missing data hide useful information. This paper compares methods for imputing missing categorical data for supervised classification tasks.…

Machine Learning · Statistics 2020-08-11 Jason Poulos, Rafael Valle

Missing values are a common phenomenon in all areas of applied research. While various imputation methods are available for metrically scaled variables, methods for categorical data are scarce. An imputation method that has been shown to…

Methodology · Statistics 2017-10-04 Shahla Faisal, Gerhard Tutz

Missing data is a common problem faced with real-world datasets. Imputation is a widely used technique to estimate the missing data. State-of-the-art imputation approaches, such as Generative Adversarial Imputation Nets (GAIN), model the…

Machine Learning · Computer Science 2020-12-02 Saqib Ejaz Awan, Mohammed Bennamoun, Ferdous Sohel, Frank M Sanfilippo, Girish Dwivedi

Data acquisition and recording in the form of databases are routine operations. The process of collecting data, however, may experience irregularities, resulting in databases with missing data. Missing entries might alter analysis…

Databases · Computer Science 2021-08-24 Paul Dixneuf, Fausto Errico, Mathias Glaus

Missing feature values are a significant hurdle for downstream machine-learning tasks such as classification. However, imputation methods for classification might be time-consuming for high-dimensional data, and offer few theoretical…

Machine Learning · Computer Science 2025-05-15 Rahul Bordoloi, Clémence Réda, Saptarshi Bej, Olaf Wolkenhauer

Missing data is a common problem in time series data. Most methods for imputation ignore label information pertaining to the time series even if that information exists. In this paper, we provide a framework for missing data imputation in…

Data imputation, the process of filling in missing feature elements for incomplete data sets, plays a crucial role in data-driven learning. A fundamental belief is that data imputation is helpful for learning performance, and it follows…

Machine Learning · Computer Science 2025-09-30 Ruikai Yang, Fan He, Mingzhen He, Kaijie Wang, Xiaolin Huang

Numerical data imputation algorithms replace missing values by estimates to leverage incomplete data sets. Current imputation methods seek to minimize the error between the unobserved ground truth and the imputed values. But this strategy…

Machine Learning · Statistics 2023-07-11 Florian Lalande, Kenji Doya

Data Warehouses (DWs) are core components of Business Intelligence (BI). Missing data in DWs have a great impact on data analyses. Therefore, missing data need to be completed. Unlike other existing data imputation methods mainly adapted…

Databases · Computer Science 2022-10-06 Yuzhao Yang, Jérôme Darmont, Franck Ravat, Olivier Teste

Machine learning with missing data has been approached in two different ways, including feature imputation where missing feature values are estimated based on observed values, and label prediction where downstream labels are learned…

Machine Learning · Computer Science 2020-11-02 Jiaxuan You, Xiaobai Ma, Daisy Yi Ding, Mykel Kochenderfer, Jure Leskovec

This paper introduces a novel iterative method for missing data imputation that sequentially reduces the mutual information between data and the corresponding missingness mask. Inspired by GAN-based approaches that train generators to…

Machine Learning · Statistics 2025-11-26 Jiahao Yu, Qizhen Ying, Leyang Wang, Ziyue Jiang, Song Liu

Techniques such as clustering, neural networks and decision making usually rely on algorithms that are not well suited to deal with missing values. However, real-world data frequently contain such cases. The simplest solution is to…

Machine Learning · Computer Science 2016-08-16 Davi E. N. Frossard, Igor O. Nunes, Renato A. Krohling

Missing data is a crucial issue when applying machine learning algorithms to real-world datasets. Starting from the simple assumption that two batches extracted randomly from the same dataset should share the same distribution, we leverage…

Machine Learning · Statistics 2020-07-02 Boris Muzellec, Julie Josse, Claire Boyer, Marco Cuturi

In many machine learning applications, we are faced with incomplete datasets. In the literature, missing data imputation techniques have been mostly concerned with filling missing values. However, the existence of missing values is…

Machine Learning · Computer Science 2020-09-07 Mohammad Kachuee, Kimmo Karkkainen, Orpaz Goldstein, Sajad Darabi, Majid Sarrafzadeh

Analysis of the fairness of machine learning (ML) algorithms has recently attracted many researchers' interest. Most ML methods show bias toward protected groups, which limits the applicability of ML models in many applications like crime rate…

Machine Learning · Computer Science 2022-11-03 Haris Mansoor, Sarwan Ali, Shafiq Alam, Muhammad Asad Khan, Umair ul Hassan, Imdadullah Khan

Missing data are ubiquitous in real-world applications and, if not adequately handled, may lead to the loss of information and biased findings in downstream analysis. Particularly, high-dimensional incomplete data with a moderate sample…

Machine Learning · Computer Science 2022-12-23 Zongyu Dai, Zhiqi Bu, Qi Long

Missing data are inevitable in longitudinal studies. Traditional methods, such as the full information maximum likelihood (FIML), are commonly used to handle ignorable missing data. However, they may lead to biased model estimation due to…

Applications · Statistics 2024-01-01 Dandan Tang, Xin Tong

Missing data is a fundamental challenge in data science, significantly hindering analysis and decision-making across a wide range of disciplines, including healthcare, bioinformatics, social science, e-commerce, and industrial monitoring.…

Machine Learning · Statistics 2026-05-12 Jicong Fan

Due to the cost or interference of measurement, we need to control the measurement system. Assuming that each variable can be measured sequentially, there exists an optimal policy for choosing the next measurement given the previous observations. Though…

Machine Learning · Computer Science 2022-04-11 Seongwook Yoon, Jaehyun Kim, Heejeong Lim, Sanghoon Sull

Missing data is a common problem in practical data science settings. Various imputation methods have been developed to deal with missing data. However, even though the labels are available in the training data in many situations, the common…

Machine Learning · Computer Science 2025-01-30 Thu Nguyen, Tuan L. Vo, Pål Halvorsen, Michael A. Riegler