Related papers: Robust Optimal Classification Trees under Noisy La…
In this paper we present a novel mathematical optimization-based methodology to construct tree-shaped classification rules for multiclass instances. Our approach consists of building Classification Trees in which, except for the leaf nodes,…
We consider the problem of training a model under the presence of label noise. Current approaches identify samples with potentially incorrect labels and reduce their influence on the learning process by either assigning lower weights to…
In this paper we propose novel methodologies to construct Support Vector Machine -based classifiers that takes into account that label noises occur in the training sample. We propose different alternatives based on solving Mixed Integer…
In most practical problems of classifier learning, the training data suffers from the label noise. Hence, it is important to understand how robust is a learning algorithm to such label noise. This paper presents some theoretical analysis to…
In the real world, data is often noisy, affecting not only the quality of features but also the accuracy of labels. Current research on mitigating label errors stems primarily from advances in deep learning, and a gap exists in exploring…
Decision trees are one of the most popular methods for solving classification problems, mainly because of their good interpretability properties. Moreover, due to advances in recent years in mixed-integer optimization, several models have…
Large datasets in NLP suffer from noisy labels, due to erroneous automatic and human annotation procedures. We study the problem of text classification with label noise, and aim to capture this noise through an auxiliary noise model over…
Labor-intensive labeling becomes a bottleneck in developing computer vision algorithms based on deep learning. For this reason, dealing with imperfect labels has increasingly gained attention and has become an active field of study. We…
We study the problem of multiple kernel learning from noisy labels. This is in contrast to most of the previous studies on multiple kernel learning that mainly focus on developing efficient algorithms and assume perfectly labeled training…
For multi-class classification under class-conditional label noise, we prove that the accuracy metric itself can be robust. We concretize this finding's inspiration in two essential aspects: training and validation, with which we address…
To collect large scale annotated data, it is inevitable to introduce label noise, i.e., incorrect class labels. To be robust against label noise, many successful methods rely on the noisy classifiers (i.e., models trained on the noisy…
Collecting large-scale data with clean labels for supervised training of neural networks is practically challenging. Although noisy labels are usually cheap to acquire, existing methods suffer a lot from label noise. This paper targets at…
Previous research has shown that constraining the gradient of loss function with respect to model-predicted probabilities can enhance the model robustness against noisy labels. These methods typically specify a fixed optimal threshold for…
Supervised learning of deep neural networks heavily relies on large-scale datasets annotated by high-quality labels. In contrast, mislabeled samples can significantly degrade the generalization of models and result in memorizing samples,…
Deep learning has achieved remarkable success in numerous domains with help from large amounts of big data. However, the quality of data labels is a concern because of the lack of high-quality labels in many real-world scenarios. As noisy…
Multi-label image classification has generated significant interest in recent years and the performance of such systems often suffers from the not so infrequent occurrence of incorrect or missing labels in the training data. In this paper,…
Labelling of data for supervised learning can be costly and time-consuming and the risk of incorporating label noise in large data sets is imminent. When training a flexible discriminative model using a strictly proper loss, such noise will…
The performance of supervised classification techniques often deteriorates when the data has noisy labels. Even the semi-supervised classification approaches have largely focused only on the problem of handling missing labels. Most of the…
In a standard classification framework a set of trustworthy learning data are employed to build a decision rule, with the final aim of classifying unlabelled units belonging to the test set. Therefore, unreliable labelled observations,…
In pool-based active learning, the learner is given an unlabeled data set and aims to efficiently learn the unknown hypothesis by querying the labels of the data points. This can be formulated as the classical Optimal Decision Tree (ODT)…