Related papers: Learning from Imperfect Annotations

Learning from Multiple Expert Annotators for Enhancing Anomaly Detection in Medical Image Analysis

Building an accurate computer-aided diagnosis system based on data-driven approaches requires a large amount of high-quality labeled data. In medical imaging analysis, multiple expert annotators often produce subjective estimates about…

Computer Vision and Pattern Recognition · Computer Science · 2022-03-22 · Khiem H. Le, Tuan V. Tran, Hieu H. Pham, Hieu T. Nguyen, Tung T. Le, Ha Q. Nguyen

Don't Blame the Data, Blame the Model: Understanding Noise and Bias When Learning from Subjective Annotations

Researchers have raised awareness about the harms of aggregating labels especially in subjective tasks that naturally contain disagreements among human annotators. In this work we show that models that are only provided aggregated labels…

Computation and Language · Computer Science · 2024-03-08 · Abhishek Anand, Negar Mokhberian, Prathyusha Naresh Kumar, Anweasha Saha, Zihao He, Ashwin Rao, Fred Morstatter, Kristina Lerman

A General Model for Aggregating Annotations Across Simple, Complex, and Multi-Object Annotation Tasks

Human annotations are vital to supervised learning, yet annotators often disagree on the correct label, especially as annotation tasks increase in complexity. A strategy to improve label quality is to ask multiple annotators to label the…

Machine Learning · Computer Science · 2023-12-22 · Alexander Braylan, Madalyn Marabella, Omar Alonso, Matthew Lease

Learning from Crowds with Sparse and Imbalanced Annotations

Traditional supervised learning requires ground truth labels for the training data, whose collection can be difficult in many cases. Recently, crowdsourcing has established itself as an efficient labeling solution through resorting to…

Machine Learning · Computer Science · 2021-07-13 · Ye Shi, Shao-Yuan Li, Sheng-Jun Huang

Aggregating Soft Labels from Crowd Annotations Improves Uncertainty Estimation Under Distribution Shift

Selecting an effective training signal for machine learning tasks is difficult: expert annotations are expensive, and crowd-sourced annotations may not be reliable. Recent work has demonstrated that learning from a distribution over labels…

Computation and Language · Computer Science · 2025-04-23 · Dustin Wright, Isabelle Augenstein

Towards Good Practices for Efficiently Annotating Large-Scale Image Classification Datasets

Data is the engine of modern computer vision, which necessitates collecting large-scale datasets. This is expensive, and guaranteeing the quality of the labels is a major challenge. In this paper, we investigate efficient annotation…

Computer Vision and Pattern Recognition · Computer Science · 2021-04-27 · Yuan-Hong Liao, Amlan Kar, Sanja Fidler

Human-in-the-loop: Towards Label Embeddings for Measuring Classification Difficulty

Uncertainty in machine learning models is a timely and vast field of research. In supervised learning, uncertainty can already occur in the first stage of the training process, the annotation phase. This scenario is particularly evident…

Machine Learning · Computer Science · 2024-07-24 · Katharina Hechinger, Christoph Koller, Xiao Xiang Zhu, Göran Kauermann

From Ground Truth to Measurement: A Statistical Framework for Human Labeling

Supervised machine learning assumes that labeled data provide accurate measurements of the concepts models are meant to learn. Yet in practice, human labeling introduces systematic variation arising from ambiguous items, divergent…

Methodology · Statistics · 2026-04-10 · Robert Chew, Stephanie Eckman, Christoph Kern, Frauke Kreuter

Crowdsourcing Ground Truth for Medical Relation Extraction

Cognitive computing systems require human labeled data for evaluation, and often for training. The standard practice used in gathering this data minimizes disagreement between annotators, and we have found this results in data that fails to…

Computation and Language · Computer Science · 2018-09-27 · Anca Dumitrache, Lora Aroyo, Chris Welty

Beyond Agreement: Rethinking Ground Truth in Educational AI Annotation

Humans can be notoriously imperfect evaluators. They are often biased, unreliable, and unfit to define "ground truth." Yet, given the surging need to produce large amounts of training data in educational applications using AI, traditional…

Artificial Intelligence · Computer Science · 2025-08-04 · Danielle R. Thomas, Conrad Borchers, Kenneth R. Koedinger

Inferring ground truth from multi-annotator ordinal data: a probabilistic approach

A popular approach for large scale data annotation tasks is crowdsourcing, wherein each data point is labeled by multiple noisy annotators. We consider the problem of inferring ground truth from noisy ordinal labels obtained from multiple…

Machine Learning · Statistics · 2013-05-02 · Balaji Lakshminarayanan, Yee Whye Teh

Annotation Error Detection: Analyzing the Past and Present for a More Coherent Future

Annotated data is an essential ingredient in natural language processing for training and evaluating machine learning models. It is therefore very desirable for the annotations to be of high quality. Recent work, however, has shown that…

Computation and Language · Computer Science · 2022-09-27 · Jan-Christoph Klie, Bonnie Webber, Iryna Gurevych

Learning from Crowds by Modeling Common Confusions

Crowdsourcing provides a practical way to obtain large amounts of labeled data at a low cost. However, the annotation quality of annotators varies considerably, which imposes new challenges in learning a high-quality model from the…

Machine Learning · Computer Science · 2021-06-15 · Zhendong Chu, Jing Ma, Hongning Wang

Truth Inference at Scale: A Bayesian Model for Adjudicating Highly Redundant Crowd Annotations

Crowd-sourcing is a cheap and popular means of creating training and evaluation datasets for machine learning, however it poses the problem of 'truth inference', as individual workers cannot be wholly trusted to provide reliable…

Machine Learning · Computer Science · 2019-02-26 · Yuan Li, Benjamin I. P. Rubinstein, Trevor Cohn

Is one annotation enough? A data-centric image classification benchmark for noisy and ambiguous label estimation

High-quality data is necessary for modern machine learning. However, the acquisition of such data is difficult due to noisy and ambiguous annotations of humans. The aggregation of such annotations to determine the label of an image leads to…

Computer Vision and Pattern Recognition · Computer Science · 2022-11-07 · Lars Schmarje, Vasco Grossmann, Claudius Zelenka, Sabine Dippel, Rainer Kiko, Mariusz Oszust, Matti Pastell, Jenny Stracke, Anna Valros, Nina Volkmann, Reinhard Koch

Learning with Different Amounts of Annotation: From Zero to Many Labels

Training NLP systems typically assumes access to annotated data that has a single human label per example. Given imperfect labeling from annotators and inherent ambiguity of language, we hypothesize that single label is not sufficient to…

Computation and Language · Computer Science · 2021-09-14 · Shujian Zhang, Chengyue Gong, Eunsol Choi

Minority Reports: Balancing Cost and Quality in Ground Truth Data Annotation

High-quality data annotation is an essential but laborious and costly aspect of developing machine learning-based software. We explore the inherent tradeoff between annotation accuracy and cost by detecting and removing minority reports --…

Machine Learning · Computer Science · 2025-04-15 · Hsuan Wei Liao, Christopher Klugmann, Daniel Kondermann, Rafid Mahmood

Clean or Annotate: How to Spend a Limited Data Collection Budget

Crowdsourcing platforms are often used to collect datasets for training machine learning models, despite higher levels of inaccurate labeling compared to expert labeling. There are two common strategies to manage the impact of such noise.…

Computation and Language · Computer Science · 2022-06-14 · Derek Chen, Zhou Yu, Samuel R. Bowman

Joint Multi-Dimensional Model for Global and Time-Series Annotations

Crowdsourcing is a popular approach to collect annotations for unlabeled data instances. It involves collecting a large number of annotations from several, often naive untrained annotators for each data instance which are then combined to…

Machine Learning · Computer Science · 2020-05-08 · Anil Ramakrishna, Rahul Gupta, Shrikanth Narayanan

Learning to Contextually Aggregate Multi-Source Supervision for Sequence Labeling

Sequence labeling is a fundamental framework for various natural language processing problems. Its performance is largely influenced by the annotation quality and quantity in supervised learning scenarios, and obtaining ground truth labels…

Computation and Language · Computer Science · 2020-04-17 · Ouyu Lan, Xiao Huang, Bill Yuchen Lin, He Jiang, Liyuan Liu, Xiang Ren