Related papers: Self-Tuning for Data-Efficient Deep Learning

Revisiting Pretraining for Semi-Supervised Learning in the Low-Label Regime

Semi-supervised learning (SSL) addresses the lack of labeled data by exploiting large unlabeled data through pseudolabeling. However, in the extremely low-label regime, pseudo labels could be incorrect, a.k.a. the confirmation bias, and the…

Computer Vision and Pattern Recognition · Computer Science 2022-05-09 Xun Xu, Jingyi Liao, Lile Cai, Manh Cuong Nguyen, Kangkang Lu, Wanyue Zhang, Yasin Yazici, Chuan Sheng Foo

When Semi-Supervised Learning Meets Transfer Learning: Training Strategies, Models and Datasets

Semi-Supervised Learning (SSL) has been proved to be an effective way to leverage both labeled and unlabeled data at the same time. Recent semi-supervised approaches focus on deep neural networks and have achieved promising results on…

Computer Vision and Pattern Recognition · Computer Science 2018-12-14 Hong-Yu Zhou, Avital Oliver, Jianxin Wu, Yefeng Zheng

Transfer Learning or Self-supervised Learning? A Tale of Two Pretraining Paradigms

Pretraining has become a standard technique in computer vision and natural language processing, which usually helps to improve performance substantially. Previously, the most dominant pretraining method is transfer learning (TL), which uses…

Computer Vision and Pattern Recognition · Computer Science 2020-07-09 Xingyi Yang, Xuehai He, Yuxiao Liang, Yue Yang, Shanghang Zhang, Pengtao Xie

Debiased Self-Training for Semi-Supervised Learning

Deep neural networks achieve remarkable performances on a wide range of tasks with the aid of large-scale labeled datasets. Yet these datasets are time-consuming and labor-exhaustive to obtain on realistic tasks. To mitigate the requirement…

Machine Learning · Computer Science 2022-11-10 Baixu Chen, Junguang Jiang, Ximei Wang, Pengfei Wan, Jianmin Wang, Mingsheng Long

Self-supervised visual learning in the low-data regime: a comparative evaluation

Self-Supervised Learning (SSL) is a valuable and robust training methodology for contemporary Deep Neural Networks (DNNs), enabling unsupervised pretraining on a 'pretext task' that does not require ground-truth labels/annotation. This…

Computer Vision and Pattern Recognition · Computer Science 2024-12-30 Sotirios Konstantakos, Jorgen Cani, Ioannis Mademlis, Despina Ioanna Chalkiadaki, Yuki M. Asano, Efstratios Gavves, Georgios Th. Papadopoulos

Label-Efficient Dataset Pruning via Semi-Supervised Pseudo-Labeling

Dataset pruning reduces the storage and training costs of deep learning by selecting an informative subset from a large dataset. However, most existing pruning methods require fully labeled data, which limits their applicability in…

Machine Learning · Computer Science 2026-05-25 Yeseul Cho, Baekrok Shin, Changmin Kang, Chulhee Yun

Unlabeled Data vs. Pre-trained Knowledge: Rethinking SSL in the Era of Large Models

Semi-supervised learning (SSL) alleviates the cost of data labeling process by exploiting unlabeled data and has achieved promising results. Meanwhile, with the development of large foundation models, exploiting pre-trained models becomes a…

Machine Learning · Computer Science 2025-10-28 Song-Lin Lv, Rui Zhu, Tong Wei, Yu-Feng Li, Lan-Zhe Guo

Rethinking Semi-supervised Learning with Language Models

Semi-supervised learning (SSL) is a popular setting aiming to effectively utilize unlabelled data to improve model performance in downstream natural language processing (NLP) tasks. Currently, there are two popular approaches to make use of…

Computation and Language · Computer Science 2023-05-23 Zhengxiang Shi, Francesco Tonolini, Nikolaos Aletras, Emine Yilmaz, Gabriella Kazai, Yunlong Jiao

Pseudo-Labeled Auto-Curriculum Learning for Semi-Supervised Keypoint Localization

Localizing keypoints of an object is a basic visual problem. However, supervised learning of a keypoint localization network often requires a large amount of data, which is expensive and time-consuming to obtain. To remedy this, there is an…

Computer Vision and Pattern Recognition · Computer Science 2022-01-25 Can Wang, Sheng Jin, Yingda Guan, Wentao Liu, Chen Qian, Ping Luo, Wanli Ouyang

DoubleMatch: Improving Semi-Supervised Learning with Self-Supervision

Following the success of supervised learning, semi-supervised learning (SSL) is now becoming increasingly popular. SSL is a family of methods, which in addition to a labeled training set, also use a sizable collection of unlabeled data for…

Machine Learning · Computer Science 2022-05-12 Erik Wallin, Lennart Svensson, Fredrik Kahl, Lars Hammarstrand

Towards Realistic Semi-Supervised Learning

Deep learning is pushing the state-of-the-art in many computer vision applications. However, it relies on large annotated data repositories, and capturing the unconstrained nature of the real-world data is yet to be solved. Semi-supervised…

Computer Vision and Pattern Recognition · Computer Science 2022-07-29 Mamshad Nayeem Rizve, Navid Kardan, Mubarak Shah

Using Self-supervised Learning Can Improve Model Fairness

Self-supervised learning (SSL) has become the de facto training paradigm of large models, where pre-training is followed by supervised fine-tuning using domain-specific data and labels. Despite demonstrating comparable performance with…

Machine Learning · Computer Science 2024-06-05 Sofia Yfantidou, Dimitris Spathis, Marios Constantinides, Athena Vakali, Daniele Quercia, Fahim Kawsar

In all LikelihoodS: How to Reliably Select Pseudo-Labeled Data for Self-Training in Semi-Supervised Learning

Self-training is a simple yet effective method within semi-supervised learning. The idea is to iteratively enhance training data by adding pseudo-labeled data. Its generalization performance heavily depends on the selection of these…

Machine Learning · Statistics 2023-03-03 Julian Rodemann, Christoph Jansen, Georg Schollmeyer, Thomas Augustin

A Survey on Self-supervised Learning: Algorithms, Applications, and Future Trends

Deep supervised learning algorithms typically require a large volume of labeled data to achieve satisfactory performance. However, the process of collecting and labeling such data can be expensive and time-consuming. Self-supervised…

Machine Learning · Computer Science 2024-07-16 Jie Gui, Tuo Chen, Jing Zhang, Qiong Cao, Zhenan Sun, Hao Luo, Dacheng Tao

Doubly Robust Self-Training

Self-training is an important technique for solving semi-supervised learning problems. It leverages unlabeled data by generating pseudo-labels and combining them with a limited labeled dataset for training. The effectiveness of…

Machine Learning · Computer Science 2023-11-06 Banghua Zhu, Mingyu Ding, Philip Jacobson, Ming Wu, Wei Zhan, Michael Jordan, Jiantao Jiao

Integrating Distribution Matching into Semi-Supervised Contrastive Learning for Labeled and Unlabeled Data

The advancement of deep learning has greatly improved supervised image classification. However, labeling data is costly, prompting research into unsupervised learning methods such as contrastive learning. In real-world scenarios, fully…

Artificial Intelligence · Computer Science 2026-01-09 Shogo Nakayama, Masahiro Okuda

Do not trust what you trust: Miscalibration in Semi-supervised Learning

State-of-the-art semi-supervised learning (SSL) approaches rely on highly confident predictions to serve as pseudo-labels that guide the training on unlabeled samples. An inherent drawback of this strategy stems from the quality of the…

Machine Learning · Computer Science 2024-03-26 Shambhavi Mishra, Balamurali Murugesan, Ismail Ben Ayed, Marco Pedersoli, Jose Dolz

Pseudo-Label Noise Suppression Techniques for Semi-Supervised Semantic Segmentation

Semi-supervised learning (SSL) can reduce the need for large labelled datasets by incorporating unlabelled data into the training. This is particularly interesting for semantic segmentation, where labelling data is very costly and…

Computer Vision and Pattern Recognition · Computer Science 2022-10-20 Sebastian Scherer, Robin Schön, Rainer Lienhart

Revisiting Self-Training with Regularized Pseudo-Labeling for Tabular Data

Recent progress in semi- and self-supervised learning has caused a rift in the long-held belief about the need for an enormous amount of labeled data for machine learning and the irrelevancy of unlabeled data. Although it has been…

Machine Learning · Computer Science 2023-03-14 Minwook Kim, Juseong Kim, Giltae Song

Self-supervised learning for autonomous vehicles perception: A conciliation between analytical and learning methods

Nowadays, supervised deep learning techniques yield the best state-of-the-art prediction performances for a wide variety of computer vision tasks. However, such supervised techniques generally require a large amount of manually labeled…

Computer Vision and Pattern Recognition · Computer Science 2020-06-09 Florent Chiaroni, Mohamed-Cherif Rahal, Nicolas Hueber, Frederic Dufaux