Related papers: Pairwise Instance Relation Augmentation for Long-t…

200 papers

Multi-label text classification (MLTC) aims to annotate documents with the most relevant labels from a number of candidate labels. In real applications, the distribution of label frequency often exhibits a long tail, i.e., a few labels are…

Computation and Language · Computer Science 2021-01-26 Lin Xiao, Xiangliang Zhang, Liping Jing, Chi Huang, Mingyang Song

In multi-label text classification (MLTC), each given document is associated with a set of correlated labels. To capture label correlations, previous classifier-chain and sequence-to-sequence models transform MLTC to a sequence prediction…

Computation and Language · Computer Science 2021-06-08 Ximing Zhang, Qian-Wen Zhang, Zhao Yan, Ruifang Liu, Yunbo Cao

Multi-label text classification (MLC) is a challenging task in settings of large label sets, where label support follows a Zipfian distribution. In this paper, we address this problem through retrieval augmentation, aiming to improve the…

Computation and Language · Computer Science 2023-05-23 Ilias Chalkidis, Yova Kementchedjhieva

Wrong labeling problem and long-tail relations are two main challenges caused by distant supervision in relation extraction. Recent works alleviate the wrong labeling by selective attention via multi-instance learning, but cannot well…

Computation and Language · Computer Science 2020-11-03 Yang Li, Tao Shen, Guodong Long, Jing Jiang, Tianyi Zhou, Chengqi Zhang

Large-scale Multi-label Text Classification (LMTC) has a wide range of Natural Language Processing (NLP) applications and presents interesting challenges. First, not all labels are well represented in the training set, due to the very large…

Computation and Language · Computer Science 2020-10-06 Ilias Chalkidis, Manos Fergadiotis, Sotiris Kotitsas, Prodromos Malakasiotis, Nikolaos Aletras, Ion Androutsopoulos

Legal text classification is a fundamental NLP task in the legal domain. Benchmark datasets in this area often exhibit a long-tail label distribution, where many labels are underrepresented, leading to poor model performance on rare…

Computation and Language · Computer Science 2025-09-01 Boheng Mao

Mixup is a popular data augmentation method, with many variants subsequently proposed. These methods mainly create new examples via convex combination of random data pairs and their corresponding one-hot labels. However, most of them adhere…

Computer Vision and Pattern Recognition · Computer Science 2021-10-12 Shaoyu Zhang, Chen Chen, Xiujuan Zhang, Silong Peng

Conventional multi-label classification (MLC) methods assume that all samples are fully labeled and identically distributed. Unfortunately, this assumption is unrealistic in large-scale MLC data that has long-tailed (LT) distribution and…

Machine Learning · Computer Science 2023-04-24 Wenqiao Zhang, Changshuo Liu, Lingze Zeng, Beng Chin Ooi, Siliang Tang, Yueting Zhuang

Imbalanced music genre classification is a crucial task in the Music Information Retrieval (MIR) field for identifying the long-tail, data-poor genre based on the related music audio segments, which is very prevalent in real-world…

Sound · Computer Science 2022-09-12 Xiaokai Liu, Menghua Zhang

Learning an effective representation in multi-label text classification (MLTC) is a significant challenge in NLP. This challenge arises from the inherent complexity of the task, which is shaped by two key factors: the intricate connections…

Machine Learning · Computer Science 2024-04-16 Alexandre Audibert, Aurélien Gauffre, Massih-Reza Amini

Multi-label text classification (MLTC) is the task of assigning multiple labels to a given text, and has a wide range of application domains. Most existing approaches require an enormous amount of annotated data to learn a classifier and/or…

Computation and Language · Computer Science 2023-09-26 Muberra Ozmen, Joseph Cotnareanu, Mark Coates

Motivation: Despite recent advancements in semantic representation driven by pre-trained and large-scale language models, addressing long tail challenges in multi-label text classification remains a significant issue. Long tail challenges…

Computation and Language · Computer Science 2025-03-12 Yan Yan, Junyuan Liu, Bo-Wen Zhang

Long-tailed instance segmentation is a challenging task due to the extreme imbalance of training samples among classes. It causes severe biases of the head classes (with majority samples) against the tailed ones. This renders "how to…

Computer Vision and Pattern Recognition · Computer Science 2022-04-05 Yin-Yin He, Peizhen Zhang, Xiu-Shen Wei, Xiangyu Zhang, Jian Sun

Real-world data usually present long-tailed distributions. Training on imbalanced data tends to make neural networks perform well on head classes but much worse on tail classes. The severe sparseness of training instances for the tail…

Machine Learning · Computer Science 2021-11-10 Chaozheng Wang, Shuzheng Gao, Cuiyun Gao, Pengyun Wang, Wenjie Pei, Lujia Pan, Zenglin Xu

Long-tail class incremental learning (LT CIL) remains highly challenging because the scarcity of samples in tail classes not only hampers their learning but also exacerbates catastrophic forgetting under continuously evolving and imbalanced…

Artificial Intelligence · Computer Science 2026-03-24 Xi Wang, Xu Yang, Donghao Sun, Cheng Deng

Extreme multi-label text classification (XMTC) aims at tagging a document with the most relevant labels from an extremely large-scale label set. It is a challenging problem especially for the tail labels because there are only few training…

Machine Learning · Computer Science 2019-07-15 Xin Huang, Boli Chen, Lin Xiao, Liping Jing

Data in real-world object detection often exhibits the long-tailed distribution. Existing solutions tackle this problem by mitigating the competition between the head and tail categories. However, due to the scarcity of training samples,…

Computer Vision and Pattern Recognition · Computer Science 2022-10-12 Bo Li, Yongqiang Yao, Jingru Tan, Xin Lu, Fengwei Yu, Ye Luo, Jianwei Lu

In Multi-Label Text Classification (MLTC), one sample can belong to more than one class. It is observed that in most MLTC tasks, there are dependencies or correlations among labels. Existing methods tend to ignore the relationship among…

Computation and Language · Computer Science 2020-03-27 Ankit Pal, Muru Selvakumar, Malaikannan Sankarasubbu

Real-world data often follows a long-tailed distribution, which makes the performance of existing classification algorithms degrade heavily. A key issue is that samples in tail categories fail to depict their intra-class diversity. Humans…

Computer Vision and Pattern Recognition · Computer Science 2022-02-14 Xiaohua Chen, Yucan Zhou, Dayan Wu, Wanqian Zhang, Yu Zhou, Bo Li, Weiping Wang

Traditional supervised learning heavily relies on human-annotated datasets, especially in data-hungry neural approaches. However, various tasks, especially multi-label tasks like document-level relation extraction, pose challenges in fully…

Computation and Language · Computer Science 2024-06-25 Zixia Jia, Junpeng Li, Shichuan Zhang, Anji Liu, Zilong Zheng