Related papers: Modeling and mitigating human annotation errors to…

Modeling Human Annotation Errors to Design Bias-Aware Systems for Social Stream Processing

High-quality human annotations are necessary to create effective machine learning systems for social media. Low-quality human annotations indirectly contribute to the creation of inaccurate or biased learning systems. We show that human…

Social and Information Networks · Computer Science · 2019-07-18 · Rahul Pandey, Carlos Castillo, Hemant Purohit

Hands-On Tutorial: Labeling with LLM and Human-in-the-Loop

Training and deploying machine learning models relies on a large amount of human-annotated data. As human labeling becomes increasingly expensive and time-consuming, recent research has developed multiple strategies to speed up annotation…

Computation and Language · Computer Science · 2025-01-28 · Ekaterina Artemova, Akim Tsvigun, Dominik Schlechtweg, Natalia Fedorova, Konstantin Chernyshev, Sergei Tilga, Boris Obmoroshev

Human-in-the-Loop Annotation for Image-Based Engagement Estimation: Assessing the Impact of Model Reliability on Annotation Accuracy

Human-in-the-loop (HITL) frameworks are increasingly recognized for their potential to improve annotation accuracy in emotion estimation systems by combining machine predictions with human expertise. This study focuses on integrating a…

Human-Computer Interaction · Computer Science · 2025-06-10 · Sahana Yadnakudige Subramanya, Ko Watanabe, Andreas Dengel, Shoya Ishimaru

Scalable Data Annotation Pipeline for High-Quality Large Speech Datasets Development

This paper introduces a human-in-the-loop (HITL) data annotation pipeline to generate high-quality, large-scale speech datasets. The pipeline combines human and machine advantages to more quickly, accurately, and cost-effectively annotate…

Audio and Speech Processing · Electrical Eng. & Systems · 2021-09-06 · Mingkuan Liu, Chi Zhang, Hua Xing, Chao Feng, Monchu Chen, Judith Bishop, Grace Ngapo

Reliable Annotations with Less Effort: Evaluating LLM-Human Collaboration in Search Clarifications

Despite growing interest in using large language models (LLMs) to automate annotation, their effectiveness in complex, nuanced, and multi-dimensional labelling tasks remains relatively underexplored. This study focuses on annotation for the…

Information Retrieval · Computer Science · 2025-07-02 · Leila Tavakoli, Hamed Zamani

Keeping Humans in the Loop: Human-Centered Automated Annotation with Generative AI

Automated text annotation is a compelling use case for generative large language models (LLMs) in social media research. Recent work suggests that LLMs can achieve strong performance on annotation tasks; however, these studies evaluate LLMs…

Computation and Language · Computer Science · 2024-09-24 · Nicholas Pangakis, Samuel Wolken

Generalizable Error Modeling for Human Data Annotation: Evidence From an Industry-Scale Search Data Annotation Program

Machine learning (ML) and artificial intelligence (AI) systems rely heavily on human-annotated data for training and evaluation. A major challenge in this context is the occurrence of annotation errors, as their effects can degrade model…

Machine Learning · Computer Science · 2024-09-27 · Heinrich Peters, Alireza Hashemi, James Rae

Beyond Labels: Information-Efficient Human-in-the-Loop Learning using Ranking and Selection Queries

Integrating human expertise into machine learning systems often reduces the role of experts to labeling oracles, a paradigm that limits the amount of information exchanged and fails to capture the nuances of human judgment. We address this…

Human-Computer Interaction · Computer Science · 2026-02-18 · Belén Martín-Urcelay, Yoonsang Lee, Matthieu R. Bloch, Christopher J. Rozell

Improving the Efficiency of Human-in-the-Loop Systems: Adding Artificial to Human Experts

Information systems increasingly leverage artificial intelligence (AI) and machine learning (ML) to generate value from vast amounts of data. However, ML models are imperfect and can generate incorrect classifications. Hence,…

Machine Learning · Computer Science · 2023-07-10 · Johannes Jakubik, Daniel Weber, Patrick Hemmer, Michael Vössing, Gerhard Satzger

Enhancing Text Classification through LLM-Driven Active Learning and Human Annotation

In the context of text classification, the financial burden of annotation exercises for creating training data is a critical issue. Active learning techniques, particularly those rooted in uncertainty sampling, offer a cost-effective…

Computation and Language · Computer Science · 2024-06-19 · Hamidreza Rouzegar, Masoud Makrehchi

Model-in-the-Loop (MILO): Accelerating Multimodal AI Data Annotation with LLMs

The growing demand for AI training data has transformed data annotation into a global industry, but traditional approaches relying on human annotators are often time-consuming, labor-intensive, and prone to inconsistent quality. We propose…

Human-Computer Interaction · Computer Science · 2024-09-25 · Yifan Wang, David Stevens, Pranay Shah, Wenwen Jiang, Miao Liu, Xu Chen, Robert Kuo, Na Li, Boying Gong, Daniel Lee, Jiabo Hu, Ning Zhang, Bob Kamma

A Human-ML Collaboration Framework for Improving Video Content Reviews

We deal with the problem of localized in-video taxonomic human annotation in the video content moderation domain, where the goal is to identify video segments that violate granular policies, e.g., community guidelines on an online video…

Machine Learning · Computer Science · 2022-10-19 · Meghana Deodhar, Xiao Ma, Yixin Cai, Alex Koes, Alex Beutel, Jilin Chen

Humans Hallucinate Too: Language Models Identify and Correct Subjective Annotation Errors With Label-in-a-Haystack Prompts

Modeling complex subjective tasks in Natural Language Processing, such as recognizing emotion and morality, is considerably challenging due to significant variation in human annotations. This variation often reflects reasonable differences…

Computation and Language · Computer Science · 2025-11-12 · Georgios Chochlakis, Peter Wu, Arjun Bedi, Marcus Ma, Kristina Lerman, Shrikanth Narayanan

How Humans Help LLMs: Assessing and Incentivizing Human Preference Annotators

Human-annotated preference data play an important role in aligning large language models (LLMs). In this paper, we study two connected questions: how to monitor the quality of human preference annotators and how to incentivize them to…

Machine Learning · Computer Science · 2026-04-08 · Shang Liu, Hanzhao Wang, Zhongyao Ma, Xiaocheng Li

Efficient Failure Pattern Identification of Predictive Algorithms

Given a (machine learning) classifier and a collection of unlabeled data, how can we efficiently identify misclassification patterns presented in this dataset? To address this problem, we propose a human-machine collaborative framework that…

Machine Learning · Computer Science · 2023-12-20 · Bao Nguyen, Viet Anh Nguyen

Augmenting Image Annotation: A Human-LMM Collaborative Framework for Efficient Object Selection and Label Generation

Traditional image annotation tasks rely heavily on human effort for object selection and label assignment, making the process time-consuming and prone to decreased efficiency as annotators experience fatigue after extensive work. This paper…

Computer Vision and Pattern Recognition · Computer Science · 2025-03-17 · He Zhang, Xinyi Fu, John M. Carroll

MCAL: Minimum Cost Human-Machine Active Labeling

Today, ground-truth generation uses data sets annotated by cloud-based annotation services. These services rely on human annotation, which can be prohibitively expensive. In this paper, we consider the problem of hybrid human-machine…

Machine Learning · Computer Science · 2023-02-28 · Hang Qiu, Krishna Chintalapudi, Ramesh Govindan

Few-Shot Point Cloud Region Annotation with Human in the Loop

We propose a point cloud annotation framework that employs human-in-loop learning to enable the creation of large point cloud datasets with per-point annotations. Sparse labels from a human annotator are iteratively propagated to generate a…

Computer Vision and Pattern Recognition · Computer Science · 2019-06-12 · Siddhant Jain, Sowmya Munukutla, David Held

Culturally Adaptive Explainable LLM Assessment for Multilingual Information Disorder: A Human-in-the-Loop Approach

Recognizing information disorder is difficult because judgments about manipulation depend on cultural and linguistic context. Yet current Large Language Models (LLMs) often behave as monocultural, English-centric "black boxes," producing…

Computation and Language · Computer Science · 2026-03-31 · Maziar Kianimoghadam Jouneghani

Active Learning for NLP with Large Language Models

Human annotation of training samples is expensive, laborious, and sometimes challenging, especially for Natural Language Processing (NLP) tasks. To reduce the labeling cost and enhance the sample efficiency, Active Learning (AL) technique…

Computation and Language · Computer Science · 2024-01-17 · Xuesong Wang