Related papers: Self-Training for Sample-Efficient Active Learning…

LST: Lexicon-Guided Self-Training for Few-Shot Text Classification

Self-training provides an effective means of using an extremely small amount of labeled data to create pseudo-labels for unlabeled data. Many state-of-the-art self-training approaches hinge on different regularization methods to prevent…

Computation and Language · Computer Science 2022-02-08 Hazel Kim, Jaeman Son, Yo-Sub Han

Uncertainty-aware Self-training for Text Classification with Few Labels

The recent success of large-scale pre-trained language models crucially hinges on fine-tuning them on large amounts of labeled data for the downstream task, which is typically expensive to acquire. In this work, we study self-training as one of…

Computation and Language · Computer Science 2020-06-30 Subhabrata Mukherjee, Ahmed Hassan Awadallah

Reducing Label Effort: Self-Supervised meets Active Learning

Active learning is a paradigm aimed at reducing the annotation effort by training the model on actively selected informative and/or representative samples. Another paradigm to reduce the annotation effort is self-training that learns from a…

Computer Vision and Pattern Recognition · Computer Science 2021-08-27 Javad Zolfaghari Bengar, Joost van de Weijer, Bartlomiej Twardowski, Bogdan Raducanu

Incremental Self-training for Semi-supervised Learning

Semi-supervised learning provides a solution to reduce the dependency of machine learning on labeled data. As one of the efficient semi-supervised techniques, self-training (ST) has received increasing attention. Several advancements have…

Machine Learning · Computer Science 2024-04-22 Jifeng Guo, Zhulin Liu, Tong Zhang, C. L. Philip Chen

SelfHAR: Improving Human Activity Recognition through Self-training with Unlabeled Data

Machine learning and deep learning have shown great promise in mobile sensing applications, including Human Activity Recognition. However, the performance of such models in real-world settings largely depends on the availability of large…

Machine Learning · Computer Science 2021-02-12 Chi Ian Tang, Ignacio Perez-Pozuelo, Dimitris Spathis, Soren Brage, Nick Wareham, Cecilia Mascolo

PT4AL: Using Self-Supervised Pretext Tasks for Active Learning

Labeling a large set of data is expensive. Active learning aims to tackle this problem by asking to annotate only the most informative data from the unlabeled set. We propose a novel active learning approach that utilizes self-supervised…

Computer Vision and Pattern Recognition · Computer Science 2022-07-27 John Seon Keun Yi, Minseok Seo, Jongchan Park, Dong-Geol Choi

SAT: Improving Semi-Supervised Text Classification with Simple Instance-Adaptive Self-Training

Self-training methods have been explored in recent years and have exhibited great performance in improving semi-supervised learning. This work presents a Simple instance-Adaptive self-Training method (SAT) for semi-supervised text…

Computation and Language · Computer Science 2022-10-25 Hui Chen, Wei Han, Soujanya Poria

Semi-Supervised Text Classification via Self-Pretraining

We present a neural semi-supervised learning model termed Self-Pretraining. Our model is inspired by the classic self-training algorithm. However, as opposed to self-training, Self-Pretraining is threshold-free; it can potentially update…

Computation and Language · Computer Science 2021-10-01 Payam Karisani, Negin Karisani

Predictions For Pre-training Language Models

Language model pre-training has proven to be useful in many language understanding tasks. In this paper, we investigate whether it is still helpful to add the self-training method in the pre-training step and the fine-tuning step. Towards…

Computation and Language · Computer Science 2023-02-17 Tong Guo

Self-Training with Weak Supervision

State-of-the-art deep neural networks require large-scale labeled training data that is often expensive to obtain or not available for many tasks. Weak supervision in the form of domain-specific rules has been shown to be useful in such…

Computation and Language · Computer Science 2021-04-13 Giannis Karamanolakis, Subhabrata Mukherjee, Guoqing Zheng, Ahmed Hassan Awadallah

Neural Networks Against (and For) Self-Training: Classification with Small Labeled and Large Unlabeled Sets

We propose a semi-supervised text classifier based on self-training using one positive and one negative property of neural networks. One of the weaknesses of self-training is the semantic drift problem, where noisy pseudo-labels accumulate…

Computation and Language · Computer Science 2024-01-02 Payam Karisani

Enhancing Self-Training Methods

Semi-supervised learning approaches train on small sets of labeled data along with large sets of unlabeled data. Self-training is a semi-supervised teacher-student approach that often suffers from the problem of "confirmation bias" that…

Machine Learning · Computer Science 2023-01-19 Aswathnarayan Radhakrishnan, Jim Davis, Zachary Rabin, Benjamin Lewis, Matthew Scherreik, Roman Ilin

ST++: Make Self-training Work Better for Semi-supervised Semantic Segmentation

Self-training via pseudo labeling is a conventional, simple, and popular pipeline to leverage unlabeled data. In this work, we first construct a strong baseline of self-training (namely ST) for semi-supervised semantic segmentation via…

Computer Vision and Pattern Recognition · Computer Science 2022-03-04 Lihe Yang, Wei Zhuo, Lei Qi, Yinghuan Shi, Yang Gao

On the Marginal Benefit of Active Learning: Does Self-Supervision Eat Its Cake?

Active learning is the set of techniques for intelligently labeling large unlabeled datasets to reduce the labeling effort. In parallel, recent developments in self-supervised and semi-supervised learning (S4L) provide powerful techniques,…

Machine Learning · Computer Science 2020-11-17 Yao-Chun Chan, Mingchen Li, Samet Oymak

Rethinking Semi-supervised Learning with Language Models

Semi-supervised learning (SSL) is a popular setting aiming to effectively utilize unlabelled data to improve model performance in downstream natural language processing (NLP) tasks. Currently, there are two popular approaches to make use of…

Computation and Language · Computer Science 2023-05-23 Zhengxiang Shi, Francesco Tonolini, Nikolaos Aletras, Emine Yilmaz, Gabriella Kazai, Yunlong Jiao

Self Training with Ensemble of Teacher Models

To train robust deep learning models, large amounts of labelled data are required. However, in the absence of such large repositories of labelled data, unlabeled data can be exploited for the same purpose. Semi-Supervised learning aims to…

Machine Learning · Computer Science 2021-07-20 Soumyadeep Ghosh, Sanjay Kumar, Janu Verma, Awanish Kumar

Learning to Self-Train for Semi-Supervised Few-Shot Classification

Few-shot classification (FSC) is challenging due to the scarcity of labeled training data (e.g. only one labeled data point per class). Meta-learning has been shown to achieve promising results by learning to initialize a classification model…

Computer Vision and Pattern Recognition · Computer Science 2019-10-01 Xinzhe Li, Qianru Sun, Yaoyao Liu, Shibao Zheng, Qin Zhou, Tat-Seng Chua, Bernt Schiele

Debiased Self-Training for Semi-Supervised Learning

Deep neural networks achieve remarkable performances on a wide range of tasks with the aid of large-scale labeled datasets. Yet these datasets are time-consuming and labor-exhaustive to obtain on realistic tasks. To mitigate the requirement…

Machine Learning · Computer Science 2022-11-10 Baixu Chen, Junguang Jiang, Ximei Wang, Pengfei Wan, Jianmin Wang, Mingsheng Long

Self-Training: A Survey

Semi-supervised algorithms aim to learn prediction functions from a small set of labeled observations and a large set of unlabeled observations. Because this framework is relevant in many applications, they have received a lot of interest…

Machine Learning · Computer Science 2025-02-17 Massih-Reza Amini, Vasilii Feofanov, Loic Pauletto, Lies Hadjadj, Emilie Devijver, Yury Maximov

Self-training Improves Pre-training for Natural Language Understanding

Unsupervised pre-training has led to much recent progress in natural language understanding. In this paper, we study self-training as another way to leverage unlabeled data through semi-supervised learning. To obtain additional data for a…

Computation and Language · Computer Science 2020-10-06 Jingfei Du, Edouard Grave, Beliz Gunel, Vishrav Chaudhary, Onur Celebi, Michael Auli, Ves Stoyanov, Alexis Conneau