Related papers: Pervasive Label Errors in Test Sets Destabilize Ma…
The recent boom in artificial intelligence and machine learning has been powered by large datasets with accurate labels, combined with algorithmic advances and efficient computing. The quality of data can be a major factor in determining…
Large Language Models (LLMs) are of great interest in vulnerability detection and repair. The effectiveness of these models hinges on the quality of the datasets used for both training and evaluation. Our investigation reveals that a number…
Learning exists in the context of data, yet notions of confidence typically focus on model predictions, not label quality. Confident learning (CL) is an alternative approach which focuses instead on label quality by characterizing and…
While the ImageNet dataset has been driving computer vision research over the past decade, significant label noise and ambiguity have made top-1 accuracy an insufficient measure of further progress. To address this, new label-sets and…
Label errors have been found to be prevalent in popular text, vision, and audio datasets, which heavily influence the safe development and evaluation of machine learning algorithms. Despite increasing efforts towards improving the quality…
Labeling datasets for supervised object detection is a dull and time-consuming task. Errors can be easily introduced during annotation and overlooked during review, yielding inaccurate benchmarks and performance degradation of deep neural…
The recent success of deep learning is mostly due to the availability of big datasets with clean annotations. However, gathering a cleanly annotated dataset is not always feasible due to practical challenges. As a result, label noise is a…
A key requirement for supervised machine learning is labeled training data, which is created by annotating unlabeled data with the appropriate class. Because this process can in many cases not be done by machines, labeling needs to be…
Image classification benchmark datasets such as CIFAR, MNIST, and ImageNet serve as critical tools for model evaluation. However, despite the cleaning efforts, these datasets still suffer from pervasive noisy labels and often contain…
Deep neural networks trained on large supervised datasets have led to impressive results in image classification and other tasks. However, well-annotated datasets can be time-consuming and expensive to collect, lending increased interest to…
Testing deep learning-based systems is crucial but challenging due to the required time and labor for labeling collected raw data. To alleviate the labeling effort, multiple test selection methods have been proposed where only a subset of…
Deep neural networks (DNNs) are typically evaluated under the assumption that each image has a single correct label. However, many images in benchmarks like ImageNet contain multiple valid labels, creating a mismatch between evaluation…
Convolutional neural networks (ConvNets) have been successfully applied to satellite image scene classification. Human-labeled training datasets are essential for ConvNets to perform accurate classification. Errors in human-annotated…
Safety-critical applications require machine learning models that output accurate and calibrated probabilities. While uncalibrated deep networks are known to make over-confident predictions, it is unclear how model confidence is impacted by…
We show that large pre-trained language models are inherently highly capable of identifying label errors in natural language datasets: simply examining out-of-sample data points in descending order of fine-tuned task loss significantly…
Machine learning models are routinely used to support decisions that affect individuals -- be it to screen a patient for a serious illness or to gauge their response to treatment. In these tasks, we are limited to learning models from…
Available training data for named entity recognition (NER) often contains a significant percentage of incorrect labels for entity types and entity boundaries. Such label noise poses challenges for supervised learning and may significantly…
Learning with noisy labels has aroused much research interest since data annotations, especially for large-scale datasets, may be inevitably imperfect. Recent approaches resort to a semi-supervised learning problem by dividing training…
Semantic noise in image classification datasets, where visually similar categories are frequently mislabeled, poses a significant challenge to conventional supervised learning approaches. In this paper, we explore the potential of using…
Due to the over-emphasize of the quantity of data, the data quality has often been overlooked. However, not all training data points contribute equally to learning. In particular, if mislabeled, it might actively damage the performance of…