Related papers: Regularizing Model Complexity and Label Structure …

Inducing Generalized Multi-Label Rules with Learning Classifier Systems

In recent years, multi-label classification has attracted a significant body of research, motivated by real-life applications, such as text classification and medical diagnoses. Although sparsely studied in this context, Learning Classifier…

Neural and Evolutionary Computing · Computer Science 2015-12-29 Fani A. Tzima, Miltiadis Allamanis, Alexandros Filotheou, Pericles A. Mitkas

Comparing effectiveness of regularization methods on text classification: Simple and complex model in data shortage situation

Text classification is the task of assigning a document to a predefined class. However, it is expensive to acquire enough labeled documents or to label them. In this paper, we study the regularization methods' effects on various…

Computation and Language · Computer Science 2024-03-05 Jongga Lee, Jaeseung Yim, Seohee Park, Changwon Lim

On Regularization and Inference with Label Constraints

Prior knowledge and symbolic rules in machine learning are often expressed in the form of label constraints, especially in structured prediction problems. In this work, we compare two common strategies for encoding label constraints in a…

Machine Learning · Computer Science 2023-07-11 Kaifu Wang, Hangfeng He, Tin D. Nguyen, Piyush Kumar, Dan Roth

Multi-Label Learning with Provable Guarantee

Here we study the problem of learning labels for large text corpora where each text can be assigned a variable number of labels. The problem might seem trivial when the label dimensionality is small and can be easily solved using a series…

Machine Learning · Computer Science 2016-11-02 Sayantan Dasgupta

Substituting Data Annotation with Balanced Updates and Collective Loss in Multi-label Text Classification

Multi-label text classification (MLTC) is the task of assigning multiple labels to a given text, and has a wide range of application domains. Most existing approaches require an enormous amount of annotated data to learn a classifier and/or…

Computation and Language · Computer Science 2023-09-26 Muberra Ozmen, Joseph Cotnareanu, Mark Coates

Mitigating Class Boundary Label Uncertainty to Reduce Both Model Bias and Variance

The study of model bias and variance with respect to decision boundaries is critically important in supervised classification. There is generally a tradeoff between the two, as fine-tuning of the decision boundary of a classification model…

Machine Learning · Computer Science 2020-02-25 Matthew Almeida, Wei Ding, Scott Crouter, Ping Chen

Correcting Noisy Multilabel Predictions: Modeling Label Noise through Latent Space Shifts

Noise in data appears to be inevitable in most real-world machine learning applications and would cause severe overfitting problems. Not only can data features contain noise, but labels are also prone to be noisy due to human input. In this…

Machine Learning · Computer Science 2025-05-09 Weipeng Huang, Qin Li, Yang Xiao, Cheng Qiao, Tie Cai, Junwei Liang, Neil J. Hurley, Guangyuan Piao

Concise and interpretable multi-label rule sets

Multi-label classification is becoming increasingly ubiquitous, but not much attention has been paid to interpretability. In this paper, we develop a multi-label classifier that can be represented as a concise set of simple "if-then" rules,…

Machine Learning · Computer Science 2022-11-09 Martino Ciaperoni, Han Xiao, Aristides Gionis

Learning to Learn and Predict: A Meta-Learning Approach for Multi-Label Classification

Many tasks in natural language processing can be viewed as multi-label classification problems. However, most of the existing models are trained with the standard cross-entropy loss function and use a fixed prediction policy (e.g., a…

Computation and Language · Computer Science 2019-09-11 Jiawei Wu, Wenhan Xiong, William Yang Wang

Towards Robustness to Label Noise in Text Classification via Noise Modeling

Large datasets in NLP suffer from noisy labels, due to erroneous automatic and human annotation procedures. We study the problem of text classification with label noise, and aim to capture this noise through an auxiliary noise model over…

Computation and Language · Computer Science 2022-06-22 Siddhant Garg, Goutham Ramakrishnan, Varun Thumbe

Regularizing Neural Networks by Penalizing Confident Output Distributions

We systematically explore regularizing neural networks by penalizing low entropy output distributions. We show that penalizing low entropy output distributions, which has been shown to improve exploration in reinforcement learning, acts as…

Neural and Evolutionary Computing · Computer Science 2017-01-24 Gabriel Pereyra, George Tucker, Jan Chorowski, Łukasz Kaiser, Geoffrey Hinton

ML-Net: multi-label classification of biomedical texts with deep neural networks

In multi-label text classification, each textual document can be assigned with one or more labels. Due to this nature, the multi-label text classification task is often considered to be more challenging compared to the binary or multi-class…

Information Retrieval · Computer Science 2019-07-02 Jingcheng Du, Qingyu Chen, Yifan Peng, Yang Xiang, Cui Tao, Zhiyong Lu

Adapting RNN Sequence Prediction Model to Multi-label Set Prediction

We present an adaptation of RNN sequence models to the problem of multi-label classification for text, where the target is a set of labels, not a sequence. Previous such RNN models define probabilities for sequences but not for sets;…

Computation and Language · Computer Science 2019-04-12 Kechen Qin, Cheng Li, Virgil Pavlu, Javed A. Aslam

One Size Does Not Fit All: Exploring Variable Thresholds for Distance-Based Multi-Label Text Classification

Distance-based unsupervised text classification is a method within text classification that leverages the semantic similarity between a label and a text to determine label relevance. This method provides numerous benefits, including fast…

Computation and Language · Computer Science 2025-10-14 Jens Van Nooten, Andriy Kosar, Guy De Pauw, Walter Daelemans

In real-world applications, as data availability increases, obtaining labeled data for machine learning (ML) projects remains challenging due to the high costs and intensive efforts required for data annotation. Many ML projects,…

Machine Learning · Computer Science 2024-12-24 Ismail Hakki Karaman, Gulser Koksal, Levent Eriskin, Salih Salihoglu

Gradient-based Label Binning in Multi-label Classification

In multi-label classification, where a single example may be associated with several class labels at the same time, the ability to model dependencies between labels is considered crucial to effectively optimize non-decomposable evaluation…

Machine Learning · Computer Science 2021-06-23 Michael Rapp, Eneldo Loza Mencía, Johannes Fürnkranz, Eyke Hüllermeier

Regularization via Structural Label Smoothing

Regularization is an effective way to promote the generalization performance of machine learning models. In this paper, we focus on label smoothing, a form of output distribution regularization that prevents overfitting of a neural network…

Machine Learning · Computer Science 2020-07-07 Weizhi Li, Gautam Dasarathy, Visar Berisha

Adaptive Regularization of Labels

Recently, a variety of regularization techniques have been widely applied in deep neural networks, such as dropout, batch normalization, data augmentation, and so on. These methods mainly focus on the regularization of weight parameters to…

Machine Learning · Computer Science 2019-08-16 Qianggang Ding, Sifan Wu, Hao Sun, Jiadong Guo, Shu-Tao Xia

Robust Neural Network Classification via Double Regularization

The presence of mislabeled observations in data is a notoriously challenging problem in statistics and machine learning, associated with poor generalization properties for both traditional classifiers and, perhaps even more so, flexible…

Machine Learning · Statistics 2022-02-09 Olof Zetterqvist, Rebecka Jörnsten, Johan Jonasson

Robustness and Reliability When Training With Noisy Labels

Labelling of data for supervised learning can be costly and time-consuming and the risk of incorporating label noise in large data sets is imminent. When training a flexible discriminative model using a strictly proper loss, such noise will…

Machine Learning · Statistics 2022-05-13 Amanda Olmin, Fredrik Lindsten