Related papers: Data Imputation by Pursuing Better Classification:…

Missing Data Imputation for Supervised Learning

Missing data imputation can help improve the performance of prediction models in situations where missing data hide useful information. This paper compares methods for imputing missing categorical data for supervised classification tasks.…

Machine Learning · Statistics 2020-08-11 Jason Poulos , Rafael Valle

Handling Missing Data with Graph Representation Learning

Machine learning with missing data has been approached in two different ways, including feature imputation where missing feature values are estimated based on observed values, and label prediction where downstream labels are learned…

Machine Learning · Computer Science 2020-11-02 Jiaxuan You , Xiaobai Ma , Daisy Yi Ding , Mykel Kochenderfer , Jure Leskovec

GEDI: A Graph-based End-to-end Data Imputation Framework

Data imputation is an effective way to handle missing data, which is common in practical applications. In this study, we propose and test a novel data imputation process that achieve two important goals: (1) preserve the row-wise…

Machine Learning · Computer Science 2023-09-13 Katrina Chen , Xiuqin Liang , Zheng Ma , Zhibin Zhang

Imputations for High Missing Rate Data in Covariates via Semi-supervised Learning Approach

Advancements in data collection techniques and the heterogeneity of data resources can yield high percentages of missing observations on variables, such as block-wise missing data. Under missing-data scenarios, traditional methods such as…

Methodology · Statistics 2022-05-17 Wei Lan , Xuerong Chen , Tao Zou , Chih-Ling Tsai

Label-Guided Imputation via Forest-Based Proximities for Improved Time Series Classification

Missing data is a common problem in time series data. Most methods for imputation ignore label information pertaining to the time series even if that information exists. In this paper, we provide a framework for missing data imputation in…

Machine Learning · Statistics 2025-09-30 Jake S. Rhodes , Adam G. Rustad , Sofia Pelagalli Maia , Evan Thacker , Hyunmi Choi , Jose Gutierrez , Tatjana Rundek , Ben Shaw

An End-to-End Model for Time Series Classification In the Presence of Missing Values

Time series classification with missing data is a prevalent issue in time series analysis, as temporal data often contain missing values in practical applications. The traditional two-stage approach, which handles imputation and…

Machine Learning · Computer Science 2024-08-13 Pengshuai Yao , Mengna Liu , Xu Cheng , Fan Shi , Huan Li , Xiufeng Liu , Shengyong Chen

An Interdisciplinary and Cross-Task Review on Missing Data Imputation

Missing data is a fundamental challenge in data science, significantly hindering analysis and decision-making across a wide range of disciplines, including healthcare, bioinformatics, social science, e-commerce, and industrial monitoring.…

Machine Learning · Statistics 2026-05-12 Jicong Fan

SEGAN: semi-supervised learning approach for missing data imputation

In many practical real-world applications, data missing is a very common phenomenon, making the development of data-driven artificial intelligence theory and technology increasingly difficult. Data completion is an important method for…

Machine Learning · Computer Science 2024-06-13 Xiaohua Pan , Weifeng Wu , Peiran Liu , Zhen Li , Peng Lu , Peijian Cao , Jianfeng Zhang , Xianfei Qiu , YangYang Wu

Imputation using training labels and classification via label imputation

Missing data is a common problem in practical data science settings. Various imputation methods have been developed to deal with missing data. However, even though the labels are available in the training data in many situations, the common…

Machine Learning · Computer Science 2025-01-30 Thu Nguyen , Tuan L. Vo , Pål Halvorsen , Michael A. Riegler

No imputation without representation

By filling in missing values in datasets, imputation allows these datasets to be used with algorithms that cannot handle missing values by themselves. However, missing values may in principle contribute useful information that is lost…

Machine Learning · Computer Science 2024-10-31 Oliver Urs Lenz , Daniel Peralta , Chris Cornelis

Filling out the missing gaps: Time Series Imputation with Semi-Supervised Learning

Missing data in time series is a challenging issue affecting time series analysis. Missing data occurs due to problems like data drops or sensor malfunctioning. Imputation methods are used to fill in these values, with quality of imputation…

Machine Learning · Computer Science 2023-04-11 Karan Aggarwal , Jaideep Srivastava

Iterative missing value imputation based on feature importance

Many datasets suffer from missing values due to various reasons,which not only increases the processing difficulty of related tasks but also reduces the accuracy of classification. To address this problem, the mainstream approach is to use…

Machine Learning · Computer Science 2024-08-14 Cong Guo , Chun Liu , Wei Yang

Explainable Data Imputation using Constraints

Data values in a dataset can be missing or anomalous due to mishandling or human error. Analysing data with missing values can create bias and affect the inferences. Several analysis methods, such as principle components analysis or…

Artificial Intelligence · Computer Science 2022-05-11 Sandeep Hans , Diptikalyan Saha , Aniya Aggarwal

Classification of datasets with imputed missing values: does imputation quality matter?

Classifying samples in incomplete datasets is a common aim for machine learning practitioners, but is non-trivial. Missing data is found in most real-world datasets and these missing values are typically imputed using established methods,…

Machine Learning · Computer Science 2023-12-20 Tolou Shadbahr , Michael Roberts , Jan Stanczuk , Julian Gilbey , Philip Teare , Sören Dittmer , Matthew Thorpe , Ramon Vinas Torne , Evis Sala , Pietro Lio , Mishal Patel , AIX-COVNET Collaboration , James H. F. Rudd , Tuomas Mirtti , Antti Rannikko , John A. D. Aston , Jing Tang , Carola-Bibiane Schönlieb

Semi-Supervised Data Programming with Subset Selection

The paradigm of data programming, which uses weak supervision in the form of rules/labelling functions, and semi-supervised learning, which augments small amounts of labelled data with a large unlabelled dataset, have shown great promise in…

Machine Learning · Computer Science 2021-06-15 Ayush Maheshwari , Oishik Chatterjee , KrishnaTeja Killamsetty , Ganesh Ramakrishnan , Rishabh Iyer

HyperImpute: Generalized Iterative Imputation with Automatic Model Selection

Consider the problem of imputing missing values in a dataset. One the one hand, conventional approaches using iterative imputation benefit from the simplicity and customizability of learning conditional distributions directly, but suffer…

Machine Learning · Statistics 2022-06-17 Daniel Jarrett , Bogdan Cebere , Tennison Liu , Alicia Curth , Mihaela van der Schaar

An Innovative Imputation and Classification Approach for Accurate Disease Prediction

Imputation of missing attribute values in medical datasets for extracting hidden knowledge from medical datasets is an interesting research topic of interest which is very challenging. One cannot eliminate missing values in medical records.…

Databases · Computer Science 2016-03-11 Yelipe UshaRani , P. Sammulal

Imputation of Missing Data Using Linear Gaussian Cluster-Weighted Modeling

Missing data theory deals with the statistical methods in the occurrence of missing data. Missing data occurs when some values are not stored or observed for variables of interest. However, most of the statistical theory assumes that data…

Methodology · Statistics 2021-10-26 Luis Alejandro Masmela-Caita , Thais Paiva Galletti , Marcos Oliveira Prates

Quantum-Inspired Optimization Process for Data Imputation

Data imputation is a critical step in data pre-processing, particularly for datasets with missing or unreliable values. This study introduces a novel quantum-inspired imputation framework evaluated on the UCI Diabetes dataset, which…

Quantum Physics · Physics 2025-05-13 Nishikanta Mohanty , Bikash K. Behera , Badshah Mukherjee , Christopher Ferrie

Kernel-Induced Label Propagation by Mapping for Semi-Supervised Classification

Kernel methods have been successfully applied to the areas of pattern recognition and data mining. In this paper, we mainly discuss the issue of propagating labels in kernel space. A Kernel-Induced Label Propagation (Kernel-LP) framework by…

Computer Vision and Pattern Recognition · Computer Science 2019-06-03 Zhao Zhang , Lei Jia , Mingbo Zhao , Guangcan Liu , Meng Wang , Shuicheng Yan