English
Related papers

Related papers: Semi-supervised Learning with Missing Values Imput…

200 papers

We consider the topic of data imputation, a foundational task in machine learning that addresses issues with missing data. To that end, we propose MCFlow, a deep framework for imputation that leverages normalizing flow generative models and…

Machine Learning · Computer Science 2020-03-31 Trevor W. Richardson , Wencheng Wu , Lei Lin , Beilei Xu , Edgar A. Bernal

Advancements in data collection techniques and the heterogeneity of data resources can yield high percentages of missing observations on variables, such as block-wise missing data. Under missing-data scenarios, traditional methods such as…

Methodology · Statistics 2022-05-17 Wei Lan , Xuerong Chen , Tao Zou , Chih-Ling Tsai

This paper proposes a semi-conditional normalizing flow model for semi-supervised learning. The model uses both labelled and unlabeled data to learn an explicit model of joint distribution over objects and labels. Semi-conditional…

Machine Learning · Statistics 2020-06-23 Andrei Atanov , Alexandra Volokhova , Arsenii Ashukha , Ivan Sosnovik , Dmitry Vetrov

Semi-supervised learning methods are motivated by the availability of large datasets with unlabeled features in addition to labeled data. Unlabeled data is, however, not guaranteed to improve classification performance and has in fact been…

Machine Learning · Statistics 2019-10-25 Xiuming Liu , Dave Zachariah , Johan Wågberg , Thomas B. Schön

Missing data in time series is a challenging issue affecting time series analysis. Missing data occurs due to problems like data drops or sensor malfunctioning. Imputation methods are used to fill in these values, with quality of imputation…

Machine Learning · Computer Science 2023-04-11 Karan Aggarwal , Jaideep Srivastava

Semi-Supervised Learning (SSL) is implemented when algorithms are trained on both labeled and unlabeled data. This is a very common application of ML as it is unrealistic to obtain a fully labeled dataset. Researchers have tackled three…

Machine Learning · Computer Science 2023-08-16 Jason Lu , Michael Ma , Huaze Xu , Zixi Xu

Time series classification with missing data is a prevalent issue in time series analysis, as temporal data often contain missing values in practical applications. The traditional two-stage approach, which handles imputation and…

Machine Learning · Computer Science 2024-08-13 Pengshuai Yao , Mengna Liu , Xu Cheng , Fan Shi , Huan Li , Xiufeng Liu , Shengyong Chen

Missing data imputation can help improve the performance of prediction models in situations where missing data hide useful information. This paper compares methods for imputing missing categorical data for supervised classification tasks.…

Machine Learning · Statistics 2020-08-11 Jason Poulos , Rafael Valle

Classifying samples in incomplete datasets is a common aim for machine learning practitioners, but is non-trivial. Missing data is found in most real-world datasets and these missing values are typically imputed using established methods,…

In many application settings, the data have missing entries which make analysis challenging. An abundant literature addresses missing values in an inferential framework: estimating parameters and their variance from incomplete tables. Here,…

Machine Learning · Statistics 2024-03-22 Julie Josse , Jacob M. Chen , Nicolas Prost , Erwan Scornet , Gaël Varoquaux

Supervised learning usually requires a large amount of labelled data. However, attaining ground-truth labels is costly for many tasks. Alternatively, weakly supervised methods learn with cheap weak signals that only approximately label some…

Machine Learning · Computer Science 2024-11-26 You Lu , Wenzhuo Song , Chidubem Arachie , Bert Huang

Semi-supervised learning (SSL) constructs classifiers from datasets in which only a subset of observations is labelled, a situation that naturally arises because obtaining labels often requires expert judgement or costly manual effort. This…

Computation · Statistics 2025-12-09 Geoffrey J. McLachlan , Jinran Wu

Semi-Supervised Learning (SSL) is a framework that utilizes both labeled and unlabeled data to enhance model performance. Conventional SSL methods operate under the assumption that labeled and unlabeled data share the same label space.…

Computer Vision and Pattern Recognition · Computer Science 2023-11-16 Noam Fluss , Guy Hacohen , Daphna Weinshall

Semi-supervised learning deals with the problem of how, if possible, to take advantage of a huge amount of unclassified data, to perform a classification in situations when, typically, there is little labeled data. Even though this is not…

Machine Learning · Statistics 2020-12-11 Alejandro Cholaquidis , Ricardo Fraiman , Mariela Sued

Semi-supervised learning (SSL) constructs classifiers using both labelled and unlabelled data. It leverages information from labelled samples, whose acquisition is often costly or labour-intensive, together with unlabelled data to enhance…

Machine Learning · Statistics 2025-12-29 Jinran Wu , You-Gan Wang , Geoffrey J. McLachlan

Practically, we are often in the dilemma that the labeled data at hand are inadequate to train a reliable classifier, and more seriously, some of these labeled data may be mistakenly labeled due to the various human factors. Therefore, this…

Machine Learning · Computer Science 2019-02-21 Chen Gong , Hengmin Zhang , Jian Yang , Dacheng Tao

Semi-supervised learning deals with the problem of how, if possible, to take advantage of a huge amount of not classified data, to perform classification, in situations when, typically, the labelled data are few. Even though this is not…

Statistics Theory · Mathematics 2017-12-18 Alejandro Cholaquidis , Ricardo Fraiman , Mariela Sued

The presence of missing values within high-dimensional data is an ubiquitous problem for many applied sciences. A serious limitation of many available data mining and machine learning methods is their inability to handle partially missing…

Machine Learning · Computer Science 2022-08-02 Qi Ma , Sujit K. Ghosh

Data imputation, the process of filling in missing feature elements for incomplete data sets, plays a crucial role in data-driven learning. A fundamental belief is that data imputation is helpful for learning performance, and it follows…

Machine Learning · Computer Science 2025-09-30 Ruikai Yang , Fan He , Mingzhen He , Kaijie Wang , Xiaolin Huang

There has been a growing concern about the fairness of decision-making systems based on machine learning. The shortage of labeled data has been always a challenging problem facing machine learning based systems. In such scenarios,…

Machine Learning · Computer Science 2020-01-01 Vahid Noroozi , Sara Bahaadini , Samira Sheikhi , Nooshin Mojab , Philip S. Yu
‹ Prev 1 2 3 10 Next ›