Related papers: Weakly-Supervised Hierarchical Text Classification

Weakly-Supervised Neural Text Classification

Deep neural networks are gaining increasing popularity for the classic text classification task, due to their strong expressive power and less requirement for feature engineering. Despite such attractiveness, neural text classification…

Information Retrieval · Computer Science 2018-09-13 Yu Meng, Jiaming Shen, Chao Zhang, Jiawei Han

HDLTex: Hierarchical Deep Learning for Text Classification

The continually increasing number of documents produced each year necessitates ever improving information processing methods for searching, retrieving, and organizing text. Central to these information processing methods is document…

Machine Learning · Computer Science 2018-03-29 Kamran Kowsari, Donald E. Brown, Mojtaba Heidarysafa, Kiana Jafari Meimandi, Matthew S. Gerber, Laura E. Barnes

Efficient strategies for hierarchical text classification: External knowledge and auxiliary tasks

In hierarchical text classification, we perform a sequence of inference steps to predict the category of a document from top to bottom of a given class taxonomy. Most of the studies have focused on developing novel neural network…

Computation and Language · Computer Science 2020-05-25 Kervy Rivas Rojas, Gina Bustamante, Arturo Oncevay, Marco A. Sobrevilla Cabezudo

Efficient Path Prediction for Semi-Supervised and Weakly Supervised Hierarchical Text Classification

Hierarchical text classification has many real-world applications. However, labeling a large number of documents is costly. In practice, we can use semi-supervised learning or weakly supervised learning (e.g., dataless classification) to…

Machine Learning · Computer Science 2019-02-26 Huiru Xiao, Xin Liu, Yangqiu Song

MotifClass: Weakly Supervised Text Classification with Higher-order Metadata Information

We study the problem of weakly supervised text classification, which aims to classify text documents into a set of pre-defined categories with category surface names only and without any annotated training document provided. Most existing…

Computation and Language · Computer Science 2023-10-24 Yu Zhang, Shweta Garg, Yu Meng, Xiusi Chen, Jiawei Han

Hierarchical Metadata-Aware Document Categorization under Weak Supervision

Categorizing documents into a given label hierarchy is intuitively appealing due to the ubiquity of hierarchical topic structures in massive text corpora. Although related studies have achieved satisfying performance in fully supervised…

Computation and Language · Computer Science 2023-10-24 Yu Zhang, Xiusi Chen, Yu Meng, Jiawei Han

XAI-CLASS: Explanation-Enhanced Text Classification with Extremely Weak Supervision

Text classification aims to effectively categorize documents into pre-defined categories. Traditional methods for text classification often rely on large amounts of manually annotated training data, making the process time-consuming and…

Computation and Language · Computer Science 2023-11-02 Daniel Hajialigol, Hanwen Liu, Xuan Wang

X-Class: Text Classification with Extremely Weak Supervision

In this paper, we explore text classification with extremely weak supervision, i.e., only relying on the surface text of class names. This is a more challenging setting than the seed-driven weak supervision, which allows a few seed words…

Computation and Language · Computer Science 2022-02-09 Zihan Wang, Dheeraj Mekala, Jingbo Shang

Weakly Supervised Scene Text Detection using Deep Reinforcement Learning

The challenging field of scene text detection requires complex data annotation, which is time-consuming and expensive. Techniques, such as weak supervision, can reduce the amount of data needed. In this paper we propose a weak supervision…

Computer Vision and Pattern Recognition · Computer Science 2022-01-14 Emanuel Metzenthin, Christian Bartz, Christoph Meinel

Avoiding Your Teacher's Mistakes: Training Neural Networks with Controlled Weak Supervision

Training deep neural networks requires massive amounts of training data, but for many tasks only limited labeled data is available. This makes weak supervision attractive, using weak or noisy signals like the output of heuristic methods or…

Machine Learning · Computer Science 2017-12-08 Mostafa Dehghani, Aliaksei Severyn, Sascha Rothe, Jaap Kamps

FastClass: A Time-Efficient Approach to Weakly-Supervised Text Classification

Weakly-supervised text classification aims to train a classifier using only class descriptions and unlabeled data. Recent research shows that keyword-driven methods can achieve state-of-the-art performance on various tasks. However, these…

Computation and Language · Computer Science 2022-12-16 Tingyu Xia, Yue Wang, Yuan Tian, Yi Chang

Towards Theoretical Understanding of Weak Supervision for Information Retrieval

Neural network approaches have recently been shown to be effective in several information retrieval (IR) tasks. However, neural approaches often require large volumes of training data to perform effectively, which is not always available. To…

Information Retrieval · Computer Science 2018-06-14 Hamed Zamani, W. Bruce Croft

Minimally-Supervised Structure-Rich Text Categorization via Learning on Text-Rich Networks

Text categorization is an essential task in Web content analysis. Considering the ever-evolving Web data and new emerging categories, instead of the laborious supervised setting, in this paper, we focus on the minimally-supervised setting…

Computation and Language · Computer Science 2021-02-24 Xinyang Zhang, Chenwei Zhang, Luna Xin Dong, Jingbo Shang, Jiawei Han

Weakly-supervised Text Classification Based on Keyword Graph

Weakly-supervised text classification has received much attention in recent years for it can alleviate the heavy burden of annotating massive data. Among them, keyword-driven methods are the mainstream where user-provided keywords are…

Computation and Language · Computer Science 2021-10-07 Lu Zhang, Jiandong Ding, Yi Xu, Yingyao Liu, Shuigeng Zhou

Text Classification: A Perspective of Deep Learning Methods

In recent years, with the rapid development of information on the Internet, the number of complex texts and documents has increased exponentially, which requires a deeper understanding of deep learning methods in order to accurately…

Computation and Language · Computer Science 2023-09-26 Zhongwei Wan

Language Model Pre-training for Hierarchical Document Representations

Hierarchical neural architectures are often used to capture long-distance dependencies and have been applied to many document-level tasks such as summarization, document segmentation, and sentiment analysis. However, effective usage of such…

Computation and Language · Computer Science 2019-01-29 Ming-Wei Chang, Kristina Toutanova, Kenton Lee, Jacob Devlin

Data Consistency for Weakly Supervised Learning

In many applications, training machine learning models involves using large amounts of human-annotated data. Obtaining precise labels for the data is expensive. Instead, training with weak supervision provides a low-cost alternative. We…

Machine Learning · Computer Science 2022-02-09 Chidubem Arachie, Bert Huang

Seed Word Selection for Weakly-Supervised Text Classification with Unsupervised Error Estimation

Weakly-supervised text classification aims to induce text classifiers from only a few user-provided seed words. The vast majority of previous work assumes high-quality seed words are given. However, the expert-annotated seed words are…

Computation and Language · Computer Science 2021-04-21 Yiping Jin, Akshay Bhatia, Dittaya Wanvarie

MEGClass: Extremely Weakly Supervised Text Classification via Mutually-Enhancing Text Granularities

Text classification is essential for organizing unstructured text. Traditional methods rely on human annotations or, more recently, a set of class seed words for supervision, which can be costly, particularly for specialized or emerging…

Computation and Language · Computer Science 2023-10-31 Priyanka Kargupta, Tanay Komarlu, Susik Yoon, Xuan Wang, Jiawei Han

Self-Training with Weak Supervision

State-of-the-art deep neural networks require large-scale labeled training data that is often expensive to obtain or not available for many tasks. Weak supervision in the form of domain-specific rules has been shown to be useful in such…

Computation and Language · Computer Science 2021-04-13 Giannis Karamanolakis, Subhabrata Mukherjee, Guoqing Zheng, Ahmed Hassan Awadallah