Related papers: Weakly-Supervised Neural Text Classification

Weakly-Supervised Hierarchical Text Classification

Hierarchical text classification, which aims to classify text documents into a given hierarchy, is an important task in many real-world applications. Recently, deep neural models are gaining increasing popularity for text classification due…

Computation and Language · Computer Science 2019-01-01 Yu Meng, Jiaming Shen, Chao Zhang, Jiawei Han

Avoiding Your Teacher's Mistakes: Training Neural Networks with Controlled Weak Supervision

Training deep neural networks requires massive amounts of training data, but for many tasks only limited labeled data is available. This makes weak supervision attractive, using weak or noisy signals like the output of heuristic methods or…

Machine Learning · Computer Science 2017-12-08 Mostafa Dehghani, Aliaksei Severyn, Sascha Rothe, Jaap Kamps

Seed Word Selection for Weakly-Supervised Text Classification with Unsupervised Error Estimation

Weakly-supervised text classification aims to induce text classifiers from only a few user-provided seed words. The vast majority of previous work assumes high-quality seed words are given. However, the expert-annotated seed words are…

Computation and Language · Computer Science 2021-04-21 Yiping Jin, Akshay Bhatia, Dittaya Wanvarie

Self-Training with Weak Supervision

State-of-the-art deep neural networks require large-scale labeled training data that is often expensive to obtain or not available for many tasks. Weak supervision in the form of domain-specific rules has been shown to be useful in such…

Computation and Language · Computer Science 2021-04-13 Giannis Karamanolakis, Subhabrata Mukherjee, Guoqing Zheng, Ahmed Hassan Awadallah

Towards Theoretical Understanding of Weak Supervision for Information Retrieval

Neural network approaches have recently been shown to be effective in several information retrieval (IR) tasks. However, neural approaches often require large volumes of training data to perform effectively, which is not always available. To…

Information Retrieval · Computer Science 2018-06-14 Hamed Zamani, W. Bruce Croft

Neural Networks Against (and For) Self-Training: Classification with Small Labeled and Large Unlabeled Sets

We propose a semi-supervised text classifier based on self-training using one positive and one negative property of neural networks. One of the weaknesses of self-training is the semantic drift problem, where noisy pseudo-labels accumulate…

Computation and Language · Computer Science 2024-01-02 Payam Karisani

FastClass: A Time-Efficient Approach to Weakly-Supervised Text Classification

Weakly-supervised text classification aims to train a classifier using only class descriptions and unlabeled data. Recent research shows that keyword-driven methods can achieve state-of-the-art performance on various tasks. However, these…

Computation and Language · Computer Science 2022-12-16 Tingyu Xia, Yue Wang, Yuan Tian, Yi Chang

Weakly-supervised Text Classification Based on Keyword Graph

Weakly-supervised text classification has received much attention in recent years for it can alleviate the heavy burden of annotating massive data. Among them, keyword-driven methods are the mainstream where user-provided keywords are…

Computation and Language · Computer Science 2021-10-07 Lu Zhang, Jiandong Ding, Yi Xu, Yingyao Liu, Shuigeng Zhou

Neural Ranking Models with Weak Supervision

Despite the impressive improvements achieved by unsupervised deep neural networks in computer vision and NLP tasks, such improvements have not yet been observed in ranking for information retrieval. The reason may be the complexity of the…

Information Retrieval · Computer Science 2017-05-30 Mostafa Dehghani, Hamed Zamani, Aliaksei Severyn, Jaap Kamps, W. Bruce Croft

XAI-CLASS: Explanation-Enhanced Text Classification with Extremely Weak Supervision

Text classification aims to effectively categorize documents into pre-defined categories. Traditional methods for text classification often rely on large amounts of manually annotated training data, making the process time-consuming and…

Computation and Language · Computer Science 2023-11-02 Daniel Hajialigol, Hanwen Liu, Xuan Wang

Semi-supervised Learning using Robust Loss

The amount of manually labeled data is limited in medical applications, so semi-supervised learning and automatic labeling strategies can be an asset for training deep neural networks. However, the quality of the automatically generated…

Machine Learning · Computer Science 2022-03-04 Wenhui Cui, Haleh Akrami, Anand A. Joshi, Richard M. Leahy

LIME: Weakly-Supervised Text Classification Without Seeds

In weakly-supervised text classification, only label names act as sources of supervision. Predominant approaches to weakly-supervised text classification utilize a two-phase framework, where test samples are first assigned pseudo-labels and…

Computation and Language · Computer Science 2022-10-14 Seongmin Park, Jihwa Lee

Denoising Multi-Source Weak Supervision for Neural Text Classification

We study the problem of learning neural text classifiers without using any labeled data, but only easy-to-provide rules as multiple weak supervision sources. This problem is challenging because rule-induced weak labels are often noisy and…

Computation and Language · Computer Science 2021-03-12 Wendi Ren, Yinghao Li, Hanting Su, David Kartchner, Cassie Mitchell, Chao Zhang

Learning to Learn from Weak Supervision by Full Supervision

In this paper, we propose a method for training neural networks when we have a large set of data with weak labels and a small amount of data with true labels. In our proposed model, we train two neural networks: a target network, the…

Machine Learning · Statistics 2017-12-01 Mostafa Dehghani, Aliaksei Severyn, Sascha Rothe, Jaap Kamps

MotifClass: Weakly Supervised Text Classification with Higher-order Metadata Information

We study the problem of weakly supervised text classification, which aims to classify text documents into a set of pre-defined categories with category surface names only and without any annotated training document provided. Most existing…

Computation and Language · Computer Science 2023-10-24 Yu Zhang, Shweta Garg, Yu Meng, Xiusi Chen, Jiawei Han

Minimally-Supervised Structure-Rich Text Categorization via Learning on Text-Rich Networks

Text categorization is an essential task in Web content analysis. Considering the ever-evolving Web data and new emerging categories, instead of the laborious supervised setting, in this paper, we focus on the minimally-supervised setting…

Computation and Language · Computer Science 2021-02-24 Xinyang Zhang, Chenwei Zhang, Luna Xin Dong, Jingbo Shang, Jiawei Han

Weakly Supervised Scene Text Detection using Deep Reinforcement Learning

The challenging field of scene text detection requires complex data annotation, which is time-consuming and expensive. Techniques, such as weak supervision, can reduce the amount of data needed. In this paper we propose a weak supervision…

Computer Vision and Pattern Recognition · Computer Science 2022-01-14 Emanuel Metzenthin, Christian Bartz, Christoph Meinel

Semi-Supervised Text Classification via Self-Pretraining

We present a neural semi-supervised learning model termed Self-Pretraining. Our model is inspired by the classic self-training algorithm. However, as opposed to self-training, Self-Pretraining is threshold-free, it can potentially update…

Computation and Language · Computer Science 2021-10-01 Payam Karisani, Negin Karisani

PIEClass: Weakly-Supervised Text Classification with Prompting and Noise-Robust Iterative Ensemble Training

Weakly-supervised text classification trains a classifier using the label name of each target class as the only supervision, which largely reduces human annotation efforts. Most existing methods first use the label names as static…

Computation and Language · Computer Science 2023-10-23 Yunyi Zhang, Minhao Jiang, Yu Meng, Yu Zhang, Jiawei Han

WeText: Scene Text Detection under Weak Supervision

The requiring of large amounts of annotated training data has become a common constraint on various deep learning systems. In this paper, we propose a weakly supervised scene text detection method (WeText) that trains robust and accurate…

Computer Vision and Pattern Recognition · Computer Science 2017-10-16 Shangxuan Tian, Shijian Lu, Chongshou Li