Related papers: Modeling and mitigating human annotation errors to…

200 papers

High-quality human annotations are necessary to create effective machine learning systems for social media. Low-quality human annotations indirectly contribute to the creation of inaccurate or biased learning systems. We show that human…

Social and Information Networks · Computer Science 2019-07-18 Rahul Pandey, Carlos Castillo, Hemant Purohit

Training and deploying machine learning models relies on a large amount of human-annotated data. As human labeling becomes increasingly expensive and time-consuming, recent research has developed multiple strategies to speed up annotation…

Computation and Language · Computer Science 2025-01-28 Ekaterina Artemova, Akim Tsvigun, Dominik Schlechtweg, Natalia Fedorova, Konstantin Chernyshev, Sergei Tilga, Boris Obmoroshev

Human-in-the-loop (HITL) frameworks are increasingly recognized for their potential to improve annotation accuracy in emotion estimation systems by combining machine predictions with human expertise. This study focuses on integrating a…

Human-Computer Interaction · Computer Science 2025-06-10 Sahana Yadnakudige Subramanya, Ko Watanabe, Andreas Dengel, Shoya Ishimaru

This paper introduces a human-in-the-loop (HITL) data annotation pipeline to generate high-quality, large-scale speech datasets. The pipeline combines human and machine advantages to more quickly, accurately, and cost-effectively annotate…

Audio and Speech Processing · Electrical Eng. & Systems 2021-09-06 Mingkuan Liu, Chi Zhang, Hua Xing, Chao Feng, Monchu Chen, Judith Bishop, Grace Ngapo

Despite growing interest in using large language models (LLMs) to automate annotation, their effectiveness in complex, nuanced, and multi-dimensional labelling tasks remains relatively underexplored. This study focuses on annotation for the…

Information Retrieval · Computer Science 2025-07-02 Leila Tavakoli, Hamed Zamani

Automated text annotation is a compelling use case for generative large language models (LLMs) in social media research. Recent work suggests that LLMs can achieve strong performance on annotation tasks; however, these studies evaluate LLMs…

Computation and Language · Computer Science 2024-09-24 Nicholas Pangakis, Samuel Wolken

Machine learning (ML) and artificial intelligence (AI) systems rely heavily on human-annotated data for training and evaluation. A major challenge in this context is the occurrence of annotation errors, as their effects can degrade model…

Machine Learning · Computer Science 2024-09-27 Heinrich Peters, Alireza Hashemi, James Rae

Integrating human expertise into machine learning systems often reduces the role of experts to labeling oracles, a paradigm that limits the amount of information exchanged and fails to capture the nuances of human judgment. We address this…

Human-Computer Interaction · Computer Science 2026-02-18 Belén Martín-Urcelay, Yoonsang Lee, Matthieu R. Bloch, Christopher J. Rozell

Information systems increasingly leverage artificial intelligence (AI) and machine learning (ML) to generate value from vast amounts of data. However, ML models are imperfect and can generate incorrect classifications. Hence,…

Machine Learning · Computer Science 2023-07-10 Johannes Jakubik, Daniel Weber, Patrick Hemmer, Michael Vössing, Gerhard Satzger

In the context of text classification, the financial burden of annotation exercises for creating training data is a critical issue. Active learning techniques, particularly those rooted in uncertainty sampling, offer a cost-effective…

Computation and Language · Computer Science 2024-06-19 Hamidreza Rouzegar, Masoud Makrehchi

The growing demand for AI training data has transformed data annotation into a global industry, but traditional approaches relying on human annotators are often time-consuming, labor-intensive, and prone to inconsistent quality. We propose…

Human-Computer Interaction · Computer Science 2024-09-25 Yifan Wang, David Stevens, Pranay Shah, Wenwen Jiang, Miao Liu, Xu Chen, Robert Kuo, Na Li, Boying Gong, Daniel Lee, Jiabo Hu, Ning Zhang, Bob Kamma

We deal with the problem of localized in-video taxonomic human annotation in the video content moderation domain, where the goal is to identify video segments that violate granular policies, e.g., community guidelines on an online video…

Machine Learning · Computer Science 2022-10-19 Meghana Deodhar, Xiao Ma, Yixin Cai, Alex Koes, Alex Beutel, Jilin Chen

Modeling complex subjective tasks in Natural Language Processing, such as recognizing emotion and morality, is considerably challenging due to significant variation in human annotations. This variation often reflects reasonable differences…

Computation and Language · Computer Science 2025-11-12 Georgios Chochlakis, Peter Wu, Arjun Bedi, Marcus Ma, Kristina Lerman, Shrikanth Narayanan

Human-annotated preference data play an important role in aligning large language models (LLMs). In this paper, we study two connected questions: how to monitor the quality of human preference annotators and how to incentivize them to…

Machine Learning · Computer Science 2026-04-08 Shang Liu, Hanzhao Wang, Zhongyao Ma, Xiaocheng Li

Given a (machine learning) classifier and a collection of unlabeled data, how can we efficiently identify misclassification patterns presented in this dataset? To address this problem, we propose a human-machine collaborative framework that…

Machine Learning · Computer Science 2023-12-20 Bao Nguyen, Viet Anh Nguyen

Traditional image annotation tasks rely heavily on human effort for object selection and label assignment, making the process time-consuming and prone to decreased efficiency as annotators experience fatigue after extensive work. This paper…

Computer Vision and Pattern Recognition · Computer Science 2025-03-17 He Zhang, Xinyi Fu, John M. Carroll

Today, ground-truth generation uses data sets annotated by cloud-based annotation services. These services rely on human annotation, which can be prohibitively expensive. In this paper, we consider the problem of hybrid human-machine…

Machine Learning · Computer Science 2023-02-28 Hang Qiu, Krishna Chintalapudi, Ramesh Govindan

We propose a point cloud annotation framework that employs human-in-loop learning to enable the creation of large point cloud datasets with per-point annotations. Sparse labels from a human annotator are iteratively propagated to generate a…

Computer Vision and Pattern Recognition · Computer Science 2019-06-12 Siddhant Jain, Sowmya Munukutla, David Held

Recognizing information disorder is difficult because judgments about manipulation depend on cultural and linguistic context. Yet current Large Language Models (LLMs) often behave as monocultural, English-centric "black boxes," producing…

Computation and Language · Computer Science 2026-03-31 Maziar Kianimoghadam Jouneghani

Human annotation of training samples is expensive, laborious, and sometimes challenging, especially for Natural Language Processing (NLP) tasks. To reduce the labeling cost and enhance the sample efficiency, Active Learning (AL) technique…

Computation and Language · Computer Science 2024-01-17 Xuesong Wang