Related papers: ARTICLE: Annotator Reliability Through In-Context …

A Unified Evaluation Framework for Multi-Annotator Tendency Learning

Recent works have emerged in multi-annotator learning that shift focus from Consensus-oriented Learning (CoL), which aggregates multiple annotations into a single ground-truth prediction, to Individual Tendency Learning (ITL), which models…

Machine Learning · Computer Science · 2026-02-02 · Liyun Zhang, Fengkai Liu, Xuanmeng Sha, Bowen Wang, Hong Liu, Zheng Lian

Bridging the Gap: In-Context Learning for Modeling Human Disagreement

Large Language Models (LLMs) have shown strong performance on NLP classification tasks. However, they typically rely on aggregated labels, often via majority voting, which can obscure the human disagreement inherent in subjective annotations.…

Computation and Language · Computer Science · 2025-06-09 · Benedetta Muscato, Yue Li, Gizem Gezici, Zhixue Zhao, Fosca Giannotti

Effective Demonstration Annotation for In-Context Learning via Language Model-Based Determinantal Point Process

In-context learning (ICL) is a few-shot learning paradigm that involves learning mappings through input-output pairs and appropriately applying them to new instances. Despite the remarkable ICL capabilities demonstrated by Large Language…

Computation and Language · Computer Science · 2024-08-06 · Peng Wang, Xiaobin Wang, Chao Lou, Shengyu Mao, Pengjun Xie, Yong Jiang

Demonstrations Are All You Need: Advancing Offensive Content Paraphrasing using In-Context Learning

Paraphrasing of offensive content is a better alternative to content removal and helps improve civility in a communication environment. Supervised paraphrasers, however, rely heavily on large quantities of labelled data to help preserve…

Computation and Language · Computer Science · 2024-06-11 · Anirudh Som, Karan Sikka, Helen Gent, Ajay Divakaran, Andreas Kathol, Dimitra Vergyri

Modelling Instance-Level Annotator Reliability for Natural Language Labelling Tasks

When constructing models that learn from noisy labels produced by multiple annotators, it is important to accurately estimate the reliability of annotators. Annotators may provide labels of inconsistent quality due to their varying…

Computation and Language · Computer Science · 2019-05-14 · Maolin Li, Arvid Fahlström Myrman, Tingting Mu, Sophia Ananiadou

Consistency is Key: Disentangling Label Variation in Natural Language Processing with Intra-Annotator Agreement

We commonly use agreement measures to assess the utility of judgements made by human annotators in Natural Language Processing (NLP) tasks. While inter-annotator agreement is frequently used as an indication of label reliability by…

Computation and Language · Computer Science · 2025-10-21 · Gavin Abercrombie, Tanvi Dinkar, Amanda Cercas Curry, Verena Rieser, Dirk Hovy

Improving In-Context Learning with Prediction Feedback for Sentiment Analysis

Large language models (LLMs) have achieved promising results in sentiment analysis through the in-context learning (ICL) paradigm. However, their ability to distinguish subtle sentiments still remains a challenge. Inspired by the human…

Computation and Language · Computer Science · 2024-06-06 · Hongling Xu, Qianlong Wang, Yice Zhang, Min Yang, Xi Zeng, Bing Qin, Ruifeng Xu

Corrective In-Context Learning: Evaluating Self-Correction in Large Language Models

In-context learning (ICL) has transformed the use of large language models (LLMs) for NLP tasks, enabling few-shot learning by conditioning on labeled examples without finetuning. Despite its effectiveness, ICL is prone to errors,…

Computation and Language · Computer Science · 2025-03-21 · Mario Sanz-Guerrero, Katharina von der Wense

Enhancing Text Classification through LLM-Driven Active Learning and Human Annotation

In the context of text classification, the financial burden of annotation exercises for creating training data is a critical issue. Active learning techniques, particularly those rooted in uncertainty sampling, offer a cost-effective…

Computation and Language · Computer Science · 2024-06-19 · Hamidreza Rouzegar, Masoud Makrehchi

Towards Consistent Detection of Cognitive Distortions: LLM-Based Annotation and Dataset-Agnostic Evaluation

Text-based automated Cognitive Distortion detection is a challenging task due to its subjective nature, with low agreement scores observed even among expert human annotators, leading to unreliable annotations. We explore the use of Large…

Computation and Language · Computer Science · 2026-05-21 · Neha Sharma, Navneet Agarwal, Kairit Sirts

Counting on Consensus: Selecting the Right Inter-annotator Agreement Metric for NLP Annotation and Evaluation

Human annotation remains the foundation of reliable and interpretable data in Natural Language Processing (NLP). As annotation and evaluation tasks continue to expand, from categorical labelling to segmentation, subjective judgment, and…

Computation and Language · Computer Science · 2026-04-02 · Joseph James

A Study on the Calibration of In-context Learning

Accurate uncertainty quantification is crucial for the safe deployment of machine learning models, and prior research has demonstrated improvements in the calibration of modern language models (LMs). We study in-context learning (ICL), a…

Computation and Language · Computer Science · 2024-03-29 · Hanlin Zhang, Yi-Fan Zhang, Yaodong Yu, Dhruv Madeka, Dean Foster, Eric Xing, Himabindu Lakkaraju, Sham Kakade

In-Context Learning and Fine-Tuning GPT for Argument Mining

Large Language Models (LLMs) have become ubiquitous in NLP and deep learning. In-Context Learning (ICL) has been suggested as a bridging paradigm between the training-free and fine-tuning settings for LLMs. In ICL, an LLM is conditioned to…

Computation and Language · Computer Science · 2024-06-12 · Jérémie Cabessa, Hugo Hernault, Umer Mushtaq

Is LLM an Overconfident Judge? Unveiling the Capabilities of LLMs in Detecting Offensive Language with Annotation Disagreement

Large Language Models (LLMs) have become essential for offensive language detection, yet their ability to handle annotation disagreement remains underexplored. Disagreement samples, which arise from subjective interpretations, pose a unique…

Computation and Language · Computer Science · 2025-05-20 · Junyu Lu, Kai Ma, Kaichun Wang, Kelaiti Xiao, Roy Ka-Wei Lee, Bo Xu, Liang Yang, Hongfei Lin

Test It Before You Trust It: Applying Software Testing for Trustworthy In-context Learning

In-context learning (ICL) has emerged as a powerful capability of large language models (LLMs), enabling them to perform new tasks based on a few provided examples without explicit fine-tuning. Despite their impressive adaptability, these…

Software Engineering · Computer Science · 2025-09-09 · Teeradaj Racharak, Chaiyong Ragkhitwetsagul, Chommakorn Sontesadisai, Thanwadee Sunetnanta

Optimizing LLM Annotation of Classroom Discourse through Multi-Agent Orchestration

Large language models (LLMs) are increasingly positioned as scalable tools for annotating educational data, including classroom discourse, interaction logs, and qualitative learning artifacts. Their ability to rapidly summarize…

Artificial Intelligence · Computer Science · 2026-03-17 · Bakhtawar Ahtisham, Kirk Vanacore, Rene F. Kizilcec

In-Context Learning on a Budget: A Case Study in Token Classification

Few-shot in-context learning (ICL) typically assumes access to large annotated training sets. However, in many real-world scenarios, such as domain adaptation, there is only a limited budget to annotate a small number of samples, with the…

Computation and Language · Computer Science · 2025-01-29 · Uri Berger, Tal Baumel, Gabriel Stanovsky

Towards a Perspectivist Turn in Argument Quality Assessment

The assessment of argument quality depends on well-established logical, rhetorical, and dialectical properties that are unavoidably subjective: multiple valid assessments may exist, there is no unequivocal ground truth. This aligns with…

Computation and Language · Computer Science · 2025-02-21 · Julia Romberg, Maximilian Maurer, Henning Wachsmuth, Gabriella Lapesa

How Private is Your Attention? Bridging Privacy with In-Context Learning

In-context learning (ICL), the ability of transformer-based models to perform new tasks from examples provided at inference time, has emerged as a hallmark of modern language models. While recent works have investigated the mechanisms…

Machine Learning · Statistics · 2025-04-23 · Soham Bonnerjee, Zhen Wei Yeon, Anna Asch, Sagnik Nandy, Promit Ghosal

Are Large Language Models Reliable Argument Quality Annotators?

Evaluating the quality of arguments is a crucial aspect of any system leveraging argument mining. However, it is a challenge to obtain reliable and consistent annotations regarding argument quality, as this usually requires domain-specific…

Computation and Language · Computer Science · 2024-04-16 · Nailia Mirzakhmedova, Marcel Gohsen, Chia Hao Chang, Benno Stein