Related papers: Pervasive Label Errors in Test Sets Destabilize Ma…

Pervasive Label Errors in Seismological Machine Learning Datasets

The recent boom in artificial intelligence and machine learning has been powered by large datasets with accurate labels, combined with algorithmic advances and efficient computing. The quality of data can be a major factor in determining…

Geophysics · Physics 2025-11-14 Albert Leonardo Aguilar Suarez, Gregory Beroza

R+R: Security Vulnerability Dataset Quality Is Critical

Large Language Models (LLMs) are of great interest in vulnerability detection and repair. The effectiveness of these models hinges on the quality of the datasets used for both training and evaluation. Our investigation reveals that a number…

Software Engineering · Computer Science 2025-03-11 Anurag Swarnim Yadav, Joseph N. Wilson

Confident Learning: Estimating Uncertainty in Dataset Labels

Learning exists in the context of data, yet notions of confidence typically focus on model predictions, not label quality. Confident learning (CL) is an alternative approach which focuses instead on label quality by characterizing and…

Machine Learning · Statistics 2022-08-23 Curtis G. Northcutt, Lu Jiang, Isaac L. Chuang

Automated Classification of Model Errors on ImageNet

While the ImageNet dataset has been driving computer vision research over the past decade, significant label noise and ambiguity have made top-1 accuracy an insufficient measure of further progress. To address this, new label-sets and…

Computer Vision and Pattern Recognition · Computer Science 2024-01-08 Momchil Peychev, Mark Niklas Müller, Marc Fischer, Martin Vechev

GraphCleaner: Detecting Mislabelled Samples in Popular Graph Learning Benchmarks

Label errors have been found to be prevalent in popular text, vision, and audio datasets, which heavily influence the safe development and evaluation of machine learning algorithms. Despite increasing efforts towards improving the quality…

Machine Learning · Computer Science 2023-06-02 Yuwen Li, Miao Xiong, Bryan Hooi

Identifying Label Errors in Object Detection Datasets by Loss Inspection

Labeling datasets for supervised object detection is a dull and time-consuming task. Errors can be easily introduced during annotation and overlooked during review, yielding inaccurate benchmarks and performance degradation of deep neural…

Computer Vision and Pattern Recognition · Computer Science 2023-12-20 Marius Schubert, Tobias Riedlinger, Karsten Kahl, Daniel Kröll, Sebastian Schoenen, Siniša Šegvić, Matthias Rottmann

Label Noise Types and Their Effects on Deep Learning

The recent success of deep learning is mostly due to the availability of big datasets with clean annotations. However, gathering a cleanly annotated dataset is not always feasible due to practical challenges. As a result, label noise is a…

Computer Vision and Pattern Recognition · Computer Science 2020-03-25 Görkem Algan, İlkay Ulusoy

Identifying Mislabeled Instances in Classification Datasets

A key requirement for supervised machine learning is labeled training data, which is created by annotating unlabeled data with the appropriate class. Because this process can in many cases not be done by machines, labeling needs to be…

Machine Learning · Computer Science 2019-12-12 Nicolas Michael Müller, Karla Markert

When VLMs Meet Image Classification: Test Sets Renovation via Missing Label Identification

Image classification benchmark datasets such as CIFAR, MNIST, and ImageNet serve as critical tools for model evaluation. However, despite the cleaning efforts, these datasets still suffer from pervasive noisy labels and often contain…

Computer Vision and Pattern Recognition · Computer Science 2025-05-23 Zirui Pang, Haosheng Tan, Yuhan Pu, Zhijie Deng, Zhouan Shen, Keyu Hu, Jiaheng Wei

Deep Learning is Robust to Massive Label Noise

Deep neural networks trained on large supervised datasets have led to impressive results in image classification and other tasks. However, well-annotated datasets can be time-consuming and expensive to collect, lending increased interest to…

Machine Learning · Computer Science 2018-02-27 David Rolnick, Andreas Veit, Serge Belongie, Nir Shavit

Evaluating the Robustness of Test Selection Methods for Deep Neural Networks

Testing deep learning-based systems is crucial but challenging due to the required time and labor for labeling collected raw data. To alleviate the labeling effort, multiple test selection methods have been proposed where only a subset of…

Machine Learning · Computer Science 2023-08-03 Qiang Hu, Yuejun Guo, Xiaofei Xie, Maxime Cordy, Wei Ma, Mike Papadakis, Yves Le Traon

The Impact of the Single-Label Assumption in Image Recognition Benchmarking

Deep neural networks (DNNs) are typically evaluated under the assumption that each image has a single correct label. However, many images in benchmarks like ImageNet contain multiple valid labels, creating a mismatch between evaluation…

Computer Vision and Pattern Recognition · Computer Science 2025-05-29 Esla Timothy Anzaku, Seyed Amir Mousavi, Arnout Van Messem, Wesley De Neve

Human-annotated label noise and their impact on ConvNets for remote sensing image scene classification

Convolutional neural networks (ConvNets) have been successfully applied to satellite image scene classification. Human-labeled training datasets are essential for ConvNets to perform accurate classification. Errors in human-annotated…

Computer Vision and Pattern Recognition · Computer Science 2024-10-28 Longkang Peng, Tao Wei, Xuehong Chen, Xiaobei Chen, Rui Sun, Luoma Wan, Jin Chen, Xiaolin Zhu

On the Role of Dataset Quality and Heterogeneity in Model Confidence

Safety-critical applications require machine learning models that output accurate and calibrated probabilities. While uncalibrated deep networks are known to make over-confident predictions, it is unclear how model confidence is impacted by…

Machine Learning · Computer Science 2020-02-25 Yuan Zhao, Jiasi Chen, Samet Oymak

Detecting Label Errors by using Pre-Trained Language Models

We show that large pre-trained language models are inherently highly capable of identifying label errors in natural language datasets: simply examining out-of-sample data points in descending order of fine-tuned task loss significantly…

Computation and Language · Computer Science 2022-12-16 Derek Chong, Jenny Hong, Christopher D. Manning

Regretful Decisions under Label Noise

Machine learning models are routinely used to support decisions that affect individuals -- be it to screen a patient for a serious illness or to gauge their response to treatment. In these tasks, we are limited to learning models from…

Machine Learning · Computer Science 2025-06-10 Sujay Nagaraj, Yang Liu, Flavio P. Calmon, Berk Ustun

NoiseBench: Benchmarking the Impact of Real Label Noise on Named Entity Recognition

Available training data for named entity recognition (NER) often contains a significant percentage of incorrect labels for entity types and entity boundaries. Such label noise poses challenges for supervised learning and may significantly…

Computation and Language · Computer Science 2024-10-15 Elena Merdjanovska, Ansar Aynetdinov, Alan Akbik

Reliable Label Correction is a Good Booster When Learning with Extremely Noisy Labels

Learning with noisy labels has aroused much research interest since data annotations, especially for large-scale datasets, may be inevitably imperfect. Recent approaches resort to a semi-supervised learning problem by dividing training…

Computer Vision and Pattern Recognition · Computer Science 2022-07-20 Kai Wang, Xiangyu Peng, Shuo Yang, Jianfei Yang, Zheng Zhu, Xinchao Wang, Yang You

Noisy Label Refinement with Semantically Reliable Synthetic Images

Semantic noise in image classification datasets, where visually similar categories are frequently mislabeled, poses a significant challenge to conventional supervised learning approaches. In this paper, we explore the potential of using…

Computer Vision and Pattern Recognition · Computer Science 2025-09-05 Yingxuan Li, Jiafeng Mao, Yusuke Matsui

Assessing the Quality of the Datasets by Identifying Mislabeled Samples

Due to the overemphasis on the quantity of data, data quality has often been overlooked. However, not all training data points contribute equally to learning. In particular, if mislabeled, it might actively damage the performance of…

Machine Learning · Computer Science 2021-09-13 Vaibhav Pulastya, Gaurav Nuti, Yash Kumar Atri, Tanmoy Chakraborty