Related papers: Retrieval-augmented Multi-label Text Classificatio…

Enhancing Label Correlation Feedback in Multi-Label Text Classification via Multi-Task Learning

In multi-label text classification (MLTC), each given document is associated with a set of correlated labels. To capture label correlations, previous classifier-chain and sequence-to-sequence models transform MLTC to a sequence prediction…

Computation and Language · Computer Science 2021-06-08 Ximing Zhang, Qian-Wen Zhang, Zhao Yan, Ruifang Liu, Yunbo Cao

On Data Augmentation for Extreme Multi-label Classification

In this paper, we focus on data augmentation for the extreme multi-label classification (XMC) problem. One of the most challenging issues of XMC is the long tail label distribution where even strong models suffer from insufficient…

Computation and Language · Computer Science 2020-09-24 Danqing Zhang, Tao Li, Haiyang Zhang, Bing Yin

Retrieval-Based Multi-Label Legal Annotation: Extensible, Data-Efficient and Hallucination-Free

Multi-label legal annotation requires assigning multiple labels from large, evolving taxonomies to long, fact-intensive documents, often under limited supervision. Parametric encoders typically require task-specific training and retraining…

Computation and Language · Computer Science 2026-05-19 Li Zhang, Jaromir Savelka, Kevin Ashley

Improving the Accuracy and Efficiency of Legal Document Tagging with Large Language Models and Instruction Prompts

Legal multi-label classification is a critical task for organizing and accessing the vast amount of legal documentation. Despite its importance, it faces challenges such as the complexity of legal language, intricate label dependencies, and…

Computation and Language · Computer Science 2025-04-15 Emily Johnson, Xavier Holt, Noah Wilson

LiGCN: Label-interpretable Graph Convolutional Networks for Multi-label Text Classification

Multi-label text classification (MLTC) is an attractive and challenging task in natural language processing (NLP). Compared with single-label text classification, MLTC has a wider range of applications in practice. In this paper, we propose…

Computation and Language · Computer Science 2022-05-24 Irene Li, Aosong Feng, Hao Wu, Tianxiao Li, Toyotaro Suzumura, Ruihai Dong

In real-world applications, as data availability increases, obtaining labeled data for machine learning (ML) projects remains challenging due to the high costs and intensive efforts required for data annotation. Many ML projects,…

Machine Learning · Computer Science 2024-12-24 Ismail Hakki Karaman, Gulser Koksal, Levent Eriskin, Salih Salihoglu

A Multi-Task Embedder For Retrieval Augmented LLMs

LLMs confront inherent limitations in terms of their knowledge, memory, and action. Retrieval augmentation stands as a vital mechanism to address these limitations, which brings in useful information from external sources to augment the…

Information Retrieval · Computer Science 2026-01-06 Peitian Zhang, Shitao Xiao, Zheng Liu, Zhicheng Dou, Jian-Yun Nie

Balancing Methods for Multi-label Text Classification with Long-Tailed Class Distribution

Multi-label text classification is a challenging task because it requires capturing label dependencies. It becomes even more challenging when class distribution is long-tailed. Resampling and re-weighting are common approaches used for…

Computation and Language · Computer Science 2021-10-19 Yi Huang, Buse Giledereli, Abdullatif Köksal, Arzucan Özgür, Elif Ozkirimli

Incorporating Multiple Cluster Centers for Multi-Label Learning

Multi-label learning deals with the problem that each instance is associated with multiple labels simultaneously. Most of the existing approaches aim to improve the performance of multi-label learning by exploiting label correlations.…

Machine Learning · Computer Science 2022-01-19 Senlin Shu, Fengmao Lv, Yan Yan, Li Li, Shuo He, Jun He

Exploring Selective Retrieval-Augmentation for Long-Tail Legal Text Classification

Legal text classification is a fundamental NLP task in the legal domain. Benchmark datasets in this area often exhibit a long-tail label distribution, where many labels are underrepresented, leading to poor model performance on rare…

Computation and Language · Computer Science 2025-09-01 Boheng Mao

Substituting Data Annotation with Balanced Updates and Collective Loss in Multi-label Text Classification

Multi-label text classification (MLTC) is the task of assigning multiple labels to a given text, and has a wide range of application domains. Most existing approaches require an enormous amount of annotated data to learn a classifier and/or…

Computation and Language · Computer Science 2023-09-26 Muberra Ozmen, Joseph Cotnareanu, Mark Coates

Deep Multi Label Classification in Affine Subspaces

Multi-label classification (MLC) problems are becoming increasingly popular in the context of medical imaging. This has in part been driven by the fact that acquiring annotations for MLC is far less burdensome than for semantic segmentation…

Computer Vision and Pattern Recognition · Computer Science 2019-07-11 Thomas Kurmann, Pablo Marquez Neila, Sebastian Wolf, Raphael Sznitman

Large Language Models Do Multi-Label Classification Differently

Multi-label classification is prevalent in real-world settings, but the behavior of Large Language Models (LLMs) in this setting is understudied. We investigate how autoregressive LLMs perform multi-label classification, focusing on…

Computation and Language · Computer Science 2025-11-12 Marcus Ma, Georgios Chochlakis, Niyantha Maruthu Pandiyan, Jesse Thomason, Shrikanth Narayanan

Clustering Algorithms and RAG Enhancing Semi-Supervised Text Classification with Large LLMs

This paper proposes a Clustering, Labeling, then Augmenting framework that significantly enhances performance in Semi-Supervised Text Classification (SSTC) tasks, effectively addressing the challenge of vast datasets with limited labeled…

Computation and Language · Computer Science 2024-12-30 Shan Zhong, Jiahao Zeng, Yongxin Yu, Bohong Lin

Statistical Topic Models for Multi-Label Document Classification

Machine learning approaches to multi-label document classification have to date largely relied on discriminative modeling techniques such as support vector machines. A drawback of these approaches is that performance rapidly drops off as…

Machine Learning · Statistics 2011-11-11 Timothy N. Rubin, America Chambers, Padhraic Smyth, Mark Steyvers

Can Large Language Models Serve as Effective Classifiers for Hierarchical Multi-Label Classification of Scientific Documents at Industrial Scale?

We address the task of hierarchical multi-label classification (HMC) of scientific documents at an industrial scale, where hundreds of thousands of documents must be classified across thousands of dynamic labels. The rapid growth of…

Artificial Intelligence · Computer Science 2024-12-09 Seyed Amin Tabatabaei, Sarah Fancher, Michael Parsons, Arian Askari

ATLANTIC: Structure-Aware Retrieval-Augmented Language Model for Interdisciplinary Science

Large language models record impressive performance on many natural language processing tasks. However, their knowledge capacity is limited to the pretraining corpus. Retrieval augmentation offers an effective solution by retrieving context…

Computation and Language · Computer Science 2023-11-22 Sai Munikoti, Anurag Acharya, Sridevi Wagle, Sameera Horawalavithana

Imbalanced multi-label classification using multi-task learning with extractive summarization

Extractive summarization and imbalanced multi-label classification often require vast amounts of training data to avoid overfitting. In situations where training data is expensive to generate, leveraging information between tasks is an…

Computation and Language · Computer Science 2019-03-19 John Brandt

Compositional Generalization for Multi-label Text Classification: A Data-Augmentation Approach

Despite significant advancements in multi-label text classification, the ability of existing models to generalize to novel and seldom-encountered complex concepts, which are compositions of elementary ones, remains underexplored. This…

Computation and Language · Computer Science 2023-12-21 Yuyang Chai, Zhuang Li, Jiahui Liu, Lei Chen, Fei Li, Donghong Ji, Chong Teng

Retrieval-augmented in-context learning for multimodal large language models in disease classification

Objectives: We aim to dynamically retrieve informative demonstrations, enhancing in-context learning in multimodal large language models (MLLMs) for disease classification. Methods: We propose a Retrieval-Augmented In-Context Learning…

Artificial Intelligence · Computer Science 2025-05-06 Zaifu Zhan, Shuang Zhou, Xiaoshan Zhou, Yongkang Xiao, Jun Wang, Jiawen Deng, He Zhu, Yu Hou, Rui Zhang