Related papers: Evaluating multiple models using labeled and unlab…

Semisupervised Classifier Evaluation and Recalibration

How many labeled examples are needed to estimate a classifier's performance on a new dataset? We study the case where data is plentiful, but labels are expensive. We show that by making a few reasonable assumptions on the structure of the…

Machine Learning · Computer Science 2012-10-09 Peter Welinder, Max Welling, Pietro Perona

Semi-Supervised Approaches to Efficient Evaluation of Model Prediction Performance

In many modern machine learning applications, the outcome is expensive or time-consuming to collect while the predictor information is easy to obtain. Semi-supervised learning (SSL) aims at utilizing large amounts of 'unlabeled' data along…

Methodology · Statistics 2017-11-16 Jessica Gronsbell, Tianxi Cai

Ranking and combining multiple predictors without labeled data

In a broad range of classification and decision making problems, one is given the advice or predictions of several classifiers, of unknown reliability, over multiple questions or queries. This scenario is different from the standard…

Machine Learning · Statistics 2014-02-07 Fabio Parisi, Francesco Strino, Boaz Nadler, Yuval Kluger

Improvability Through Semi-Supervised Learning: A Survey of Theoretical Results

Semi-supervised learning is a setting in which one has labeled and unlabeled data available. In this survey we explore different types of theoretical results when one uses unlabeled data in classification and regression tasks. Most methods…

Machine Learning · Computer Science 2020-07-31 Alexander Mey, Marco Loog

Semi-Supervised Sparse Gaussian Classification: Provable Benefits of Unlabeled Data

The premise of semi-supervised learning (SSL) is that combining labeled and unlabeled data yields significantly more accurate models. Despite empirical successes, the theoretical understanding of SSL is still far from complete. In this…

Machine Learning · Statistics 2024-09-06 Eyar Azar, Boaz Nadler

Reliable Semi-Supervised Learning when Labels are Missing at Random

Semi-supervised learning methods are motivated by the availability of large datasets with unlabeled features in addition to labeled data. Unlabeled data is, however, not guaranteed to improve classification performance and has in fact been…

Machine Learning · Statistics 2019-10-25 Xiuming Liu, Dave Zachariah, Johan Wågberg, Thomas B. Schön

Modeling Multiple Annotator Expertise in the Semi-Supervised Learning Scenario

Learning algorithms normally assume that there is at most one annotation or label per data point. However, in some scenarios, such as medical diagnosis and on-line collaboration, multiple annotations may be available. In either case,…

Machine Learning · Computer Science 2012-03-19 Yan Yan, Romer Rosales, Glenn Fung, Jennifer Dy

AllMatch: Exploiting All Unlabeled Data for Semi-Supervised Learning

Existing semi-supervised learning algorithms adopt pseudo-labeling and consistency regulation techniques to introduce supervision signals for unlabeled samples. To overcome the inherent limitation of threshold-based pseudo-labeling, prior…

Machine Learning · Computer Science 2024-07-10 Zhiyu Wu, Jinshi Cui

Informative missingness and its implications in semi-supervised learning

Semi-supervised learning (SSL) constructs classifiers using both labelled and unlabelled data. It leverages information from labelled samples, whose acquisition is often costly or labour-intensive, together with unlabelled data to enhance…

Machine Learning · Statistics 2025-12-29 Jinran Wu, You-Gan Wang, Geoffrey J. McLachlan

Unlabeled Data vs. Pre-trained Knowledge: Rethinking SSL in the Era of Large Models

Semi-supervised learning (SSL) alleviates the cost of data labeling process by exploiting unlabeled data and has achieved promising results. Meanwhile, with the development of large foundation models, exploiting pre-trained models becomes a…

Machine Learning · Computer Science 2025-10-28 Song-Lin Lv, Rui Zhu, Tong Wei, Yu-Feng Li, Lan-Zhe Guo

Semi-Supervised learning with Density-Ratio Estimation

In this paper, we study statistical properties of semi-supervised learning, which is considered as an important problem in the community of machine learning. In the standard supervised learning, only the labeled data is observed. The…

Machine Learning · Statistics 2012-04-19 Masanori Kawakita, Takafumi Kanamori

Semi-Unsupervised Learning: Clustering and Classifying using Ultra-Sparse Labels

In semi-supervised learning for classification, it is assumed that every ground truth class of data is present in the small labelled dataset. Many real-world sparsely-labelled datasets are plausibly not of this type. It could easily be the…

Machine Learning · Statistics 2021-01-11 Matthew Willetts, Stephen J Roberts, Christopher C Holmes

Learning to Learn in a Semi-Supervised Fashion

To address semi-supervised learning from both labeled and unlabeled data, we present a novel meta-learning scheme. We particularly consider that labeled and unlabeled data share disjoint ground truth label sets, which can be seen in tasks like…

Computer Vision and Pattern Recognition · Computer Science 2020-08-26 Yun-Chun Chen, Chao-Te Chou, Yu-Chiang Frank Wang

Not All Unlabeled Data are Equal: Learning to Weight Data in Semi-supervised Learning

Existing semi-supervised learning (SSL) algorithms use a single weight to balance the loss of labeled and unlabeled examples, i.e., all unlabeled examples are equally weighted. But not all unlabeled data are equal. In this paper we study…

Machine Learning · Computer Science 2020-10-30 Zhongzheng Ren, Raymond A. Yeh, Alexander G. Schwing

SSLfmm: An R Package for Semi-Supervised Learning with a Mixed-Missingness Mechanism in Finite Mixture Models

Semi-supervised learning (SSL) constructs classifiers from datasets in which only a subset of observations is labelled, a situation that naturally arises because obtaining labels often requires expert judgement or costly manual effort. This…

Computation · Statistics 2025-12-09 Geoffrey J. McLachlan, Jinran Wu

MSMatch: Semi-Supervised Multispectral Scene Classification with Few Labels

Supervised learning techniques are at the center of many tasks in remote sensing. Unfortunately, these methods, especially recent deep learning methods, often require large amounts of labeled data for training. Even though satellites…

Machine Learning · Computer Science 2021-08-03 Pablo Gómez, Gabriele Meoni

Semi-supervised learning

Semi-supervised learning deals with the problem of how, if possible, to take advantage of a huge amount of not classified data, to perform classification, in situations when, typically, the labelled data are few. Even though this is not…

Statistics Theory · Mathematics 2017-12-18 Alejandro Cholaquidis, Ricardo Fraiman, Mariela Sued

On semi-supervised learning

Semi-supervised learning deals with the problem of how, if possible, to take advantage of a huge amount of unclassified data, to perform a classification in situations when, typically, there is little labeled data. Even though this is not…

Machine Learning · Statistics 2020-12-11 Alejandro Cholaquidis, Ricardo Fraiman, Mariela Sued

Semi-supervised Deep Learning for Image Classification with Distribution Mismatch: A Survey

Deep learning methodologies have been employed in several different fields, with an outstanding success in image recognition applications, such as material quality control, medical imaging, autonomous driving, etc. Deep learning models rely…

Computer Vision and Pattern Recognition · Computer Science 2022-03-11 Saul Calderon-Ramirez, Shengxiang Yang, David Elizondo

Are labels informative in semi-supervised learning? -- Estimating and leveraging the missing-data mechanism

Semi-supervised learning is a powerful technique for leveraging unlabeled data to improve machine learning models, but it can be affected by the presence of "informative" labels, which occur when some classes are more likely to be labeled…

Machine Learning · Statistics 2023-02-16 Aude Sportisse, Hugo Schmutz, Olivier Humbert, Charles Bouveyron, Pierre-Alexandre Mattei