Related papers: BERT-Flow-VAE: A Weakly-supervised Model for Multi…

QUAD-LLM-MLTC: Large Language Models Ensemble Learning for Healthcare Text Multi-Label Classification

The escalating volume of collected healthcare textual data presents a unique challenge for automated Multi-Label Text Classification (MLTC), which is primarily due to the scarcity of annotated texts for training and their nuanced nature.…

Computation and Language · Computer Science 2025-03-04 Hajar Sakai, Sarah S. Lam

Substituting Data Annotation with Balanced Updates and Collective Loss in Multi-label Text Classification

Multi-label text classification (MLTC) is the task of assigning multiple labels to a given text, and has a wide range of application domains. Most existing approaches require an enormous amount of annotated data to learn a classifier and/or…

Computation and Language · Computer Science 2023-09-26 Muberra Ozmen, Joseph Cotnareanu, Mark Coates

RulePrompt: Weakly Supervised Text Classification with Prompting PLMs and Self-Iterative Logical Rules

Weakly supervised text classification (WSTC), also called zero-shot or dataless text classification, has attracted increasing attention due to its applicability in classifying a mass of texts within the dynamic and open Web environment,…

Computation and Language · Computer Science 2024-04-26 Miaomiao Li, Jiaqi Zhu, Yang Wang, Yi Yang, Yilin Li, Hongan Wang

Minimally-Supervised Structure-Rich Text Categorization via Learning on Text-Rich Networks

Text categorization is an essential task in Web content analysis. Considering the ever-evolving Web data and new emerging categories, instead of the laborious supervised setting, in this paper, we focus on the minimally-supervised setting…

Computation and Language · Computer Science 2021-02-24 Xinyang Zhang, Chenwei Zhang, Luna Xin Dong, Jingbo Shang, Jiawei Han

Label-Wise Document Pre-Training for Multi-Label Text Classification

A major challenge of multi-label text classification (MLTC) is to stimulatingly exploit possible label differences and label correlations. In this paper, we tackle this challenge by developing Label-Wise Pre-Training (LW-PT) method to get a…

Computation and Language · Computer Science 2020-08-18 Han Liu, Caixia Yuan, Xiaojie Wang

Lbl2Vec: An Embedding-Based Approach for Unsupervised Document Retrieval on Predefined Topics

In this paper, we consider the task of retrieving documents with predefined topics from an unlabeled document dataset using an unsupervised approach. The proposed unsupervised approach requires only a small number of keywords describing the…

Computation and Language · Computer Science 2022-10-13 Tim Schopf, Daniel Braun, Florian Matthes

Semi-Supervised Sequence Modeling with Cross-View Training

Unsupervised representation learning algorithms such as word2vec and ELMo improve the accuracy of many supervised NLP models, mainly because they can take advantage of large amounts of unlabeled text. However, the supervised models only…

Computation and Language · Computer Science 2018-09-25 Kevin Clark, Minh-Thang Luong, Christopher D. Manning, Quoc V. Le

Enhancing Label Correlation Feedback in Multi-Label Text Classification via Multi-Task Learning

In multi-label text classification (MLTC), each given document is associated with a set of correlated labels. To capture label correlations, previous classifier-chain and sequence-to-sequence models transform MLTC to a sequence prediction…

Computation and Language · Computer Science 2021-06-08 Ximing Zhang, Qian-Wen Zhang, Zhao Yan, Ruifang Liu, Yunbo Cao

Layer-wise Guided Training for BERT: Learning Incrementally Refined Document Representations

Although BERT is widely used by the NLP community, little is known about its inner workings. Several attempts have been made to shed light on certain aspects of BERT, often with contradicting conclusions. A much raised concern focuses on…

Computation and Language · Computer Science 2020-10-13 Nikolaos Manginas, Ilias Chalkidis, Prodromos Malakasiotis

Weakly Supervised Label Learning Flows

Supervised learning usually requires a large amount of labelled data. However, attaining ground-truth labels is costly for many tasks. Alternatively, weakly supervised methods learn with cheap weak signals that only approximately label some…

Machine Learning · Computer Science 2024-11-26 You Lu, Wenzhuo Song, Chidubem Arachie, Bert Huang

An Unsupervised Sentence Embedding Method by Mutual Information Maximization

BERT is inefficient for sentence-pair tasks such as clustering or semantic search as it needs to evaluate combinatorially many sentence pairs which is very time-consuming. Sentence BERT (SBERT) attempted to solve this challenge by learning…

Computation and Language · Computer Science 2021-02-08 Yan Zhang, Ruidan He, Zuozhu Liu, Kwan Hui Lim, Lidong Bing

AttentionXML: Label Tree-based Attention-Aware Deep Model for High-Performance Extreme Multi-Label Text Classification

Extreme multi-label text classification (XMTC) is an important problem in the era of big data, for tagging a given text with the most relevant multiple labels from an extremely large-scale label set. XMTC can be found in many applications,…

Computation and Language · Computer Science 2019-11-05 Ronghui You, Zihan Zhang, Ziye Wang, Suyang Dai, Hiroshi Mamitsuka, Shanfeng Zhu

Label-template based Few-Shot Text Classification with Contrastive Learning

As an algorithmic framework for learning to learn, meta-learning provides a promising solution for few-shot text classification. However, most existing research fails to give enough attention to class labels. Traditional basic framework…

Computation and Language · Computer Science 2024-12-16 Guanghua Hou, Shuhui Cao, Deqiang Ouyang, Ning Wang

ML-Net: multi-label classification of biomedical texts with deep neural networks

In multi-label text classification, each textual document can be assigned with one or more labels. Due to this nature, the multi-label text classification task is often considered to be more challenging compared to the binary or multi-class…

Information Retrieval · Computer Science 2019-07-02 Jingcheng Du, Qingyu Chen, Yifan Peng, Yang Xiang, Cui Tao, Zhiyong Lu

A Novel Two-Step Fine-Tuning Pipeline for Cold-Start Active Learning in Text Classification Tasks

This is the first work to investigate the effectiveness of BERT-based contextual embeddings in active learning (AL) tasks on cold-start scenarios, where traditional fine-tuning is infeasible due to the absence of labeled data. Our primary…

Machine Learning · Computer Science 2024-07-25 Fabiano Belém, Washington Cunha, Celso França, Claudio Andrade, Leonardo Rocha, Marcos André Gonçalves

Large-Scale Multi-Label Text Classification on EU Legislation

We consider Large-Scale Multi-Label Text Classification (LMTC) in the legal domain. We release a new dataset of 57k legislative documents from EURLEX, annotated with ~4.3k EUROVOC labels, which is suitable for LMTC, few- and zero-shot…

Computation and Language · Computer Science 2019-06-07 Ilias Chalkidis, Manos Fergadiotis, Prodromos Malakasiotis, Ion Androutsopoulos

LIME: Weakly-Supervised Text Classification Without Seeds

In weakly-supervised text classification, only label names act as sources of supervision. Predominant approaches to weakly-supervised text classification utilize a two-phase framework, where test samples are first assigned pseudo-labels and…

Computation and Language · Computer Science 2022-10-14 Seongmin Park, Jihwa Lee

Weakly Supervised Multi-Label Classification of Full-Text Scientific Papers

Instead of relying on human-annotated training samples to build a classifier, weakly supervised scientific paper classification aims to classify papers only using category descriptions (e.g., category names, category-indicative keywords).…

Computation and Language · Computer Science 2023-10-24 Yu Zhang, Bowen Jin, Xiusi Chen, Yanzhen Shen, Yunyi Zhang, Yu Meng, Jiawei Han

Multi-Evidence Filtering and Fusion for Multi-Label Classification, Object Detection and Semantic Segmentation Based on Weakly Supervised Learning

Supervised object detection and semantic segmentation require object or even pixel level annotations. When there exist image level labels only, it is challenging for weakly supervised algorithms to achieve accurate predictions. The accuracy…

Computer Vision and Pattern Recognition · Computer Science 2018-03-06 Weifeng Ge, Sibei Yang, Yizhou Yu

A Deep Reinforced Sequence-to-Set Model for Multi-Label Text Classification

Multi-label text classification (MLTC) aims to assign multiple labels to each sample in the dataset. The labels usually have internal correlations. However, traditional methods tend to ignore the correlations between labels. In order to…

Computation and Language · Computer Science 2018-09-11 Pengcheng Yang, Shuming Ma, Yi Zhang, Junyang Lin, Qi Su, Xu Sun