SAT: Improving Semi-Supervised Text Classification with Simple Instance-Adaptive Self-Training

Hui Chen; Wei Han; Soujanya Poria

Subjects: Computation and Language; Machine Learning
Date: 2022-10-25 (v1)

Abstract

Self-training methods have been explored in recent years and have proven effective for semi-supervised learning. This work presents a Simple instance-Adaptive self-Training method (SAT) for semi-supervised text classification. SAT first generates two augmented views of each unlabeled example and then trains a meta-learner to automatically identify the relative strength of the two augmentations, based on the similarity between the original view and each augmented view. The weakly-augmented view is fed to the model to produce a pseudo-label, and the strongly-augmented view is used to train the model to predict that same pseudo-label. Extensive experiments and analyses on three text classification datasets show that, across varying amounts of labeled training data, SAT is consistently competitive with existing semi-supervised learning methods. Our code is available at https://github.com/declare-lab/SAT.git.
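The weak/strong assignment and the pseudo-label objective described in the abstract can be illustrated with a minimal sketch. It is not the authors' implementation: the token-overlap similarity `jaccard` is a hypothetical stand-in for the trained meta-learner, the function names `rank_views` and `self_training_loss` are illustrative, and the class probabilities are made-up numbers rather than model outputs.

```python
import math

def jaccard(a, b):
    """Token-overlap similarity (assumed stand-in for the learned meta-learner)."""
    sa, sb = set(a.lower().split()), set(b.lower().split())
    if not sa and not sb:
        return 1.0
    return len(sa & sb) / len(sa | sb)

def rank_views(original, view_a, view_b, similarity=jaccard):
    """Return (weak, strong): the view more similar to the original is treated as weak."""
    if similarity(original, view_a) >= similarity(original, view_b):
        return view_a, view_b
    return view_b, view_a

def self_training_loss(probs_weak, probs_strong):
    """Cross-entropy of the strong view's prediction against the weak view's hard pseudo-label."""
    pseudo_label = max(range(len(probs_weak)), key=probs_weak.__getitem__)
    return pseudo_label, -math.log(probs_strong[pseudo_label])

# Hypothetical unlabeled sentence and two augmented views of it.
original = "the movie was surprisingly good"
view_a = "the film was surprisingly good"
view_b = "surprisingly , a decent picture"

weak, strong = rank_views(original, view_a, view_b)

# Made-up class probabilities standing in for the classifier's outputs on each view.
label, loss = self_training_loss(probs_weak=[0.1, 0.9], probs_strong=[0.4, 0.6])
```

Here `view_a` shares more tokens with the original sentence, so it plays the weak role and supplies the pseudo-label, while `view_b` is the view the model would be trained on.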

Keywords

semi-supervised learning; self-supervised learning; text classification

Cite

@article{arxiv.2210.12653,
  title  = {SAT: Improving Semi-Supervised Text Classification with Simple Instance-Adaptive Self-Training},
  author = {Hui Chen and Wei Han and Soujanya Poria},
  journal= {arXiv preprint arXiv:2210.12653},
  year   = {2022}
}

Comments

Accepted to EMNLP 2022 Findings
