Related papers: Learning Ambiguity from Crowd Sequential Annotatio…

200 papers

Crowdsourcing provides a practical way to obtain large amounts of labeled data at a low cost. However, the annotation quality of annotators varies considerably, which imposes new challenges in learning a high-quality model from the…

Machine Learning · Computer Science 2021-06-15 Zhendong Chu, Jing Ma, Hongning Wang

Traditional supervised learning requires ground truth labels for the training data, whose collection can be difficult in many cases. Recently, crowdsourcing has established itself as an efficient labeling solution through resorting to…

Machine Learning · Computer Science 2021-07-13 Ye Shi, Shao-Yuan Li, Sheng-Jun Huang

This paper presents a generic Bayesian framework that enables any deep learning model to actively learn from targeted crowds. Our framework inherits from recent advances in Bayesian deep learning, and extends existing work by considering…

Machine Learning · Computer Science 2018-03-13 Jie Yang, Thomas Drake, Andreas Damianou, Yoelle Maarek

Crowd sequential annotations can be an efficient and cost-effective way to build large datasets for sequence labeling. Different from tagging independent instances, for crowd sequential annotations the quality of label sequence relies on…

Computation and Language · Computer Science 2022-09-21 Xiaolei Lu, Tommy W. S. Chow

Selecting an effective training signal for machine learning tasks is difficult: expert annotations are expensive, and crowd-sourced annotations may not be reliable. Recent work has demonstrated that learning from a distribution over labels…

Computation and Language · Computer Science 2025-04-23 Dustin Wright, Isabelle Augenstein

As datasets grow larger, accurately annotating them is becoming increasingly impractical because of the cost in both time and money. Therefore, crowd-sourcing has been widely adopted to alleviate the cost of…

Machine Learning · Computer Science 2024-02-21 Hansong Zhang, Shikun Li, Dan Zeng, Chenggang Yan, Shiming Ge

Cognitive computing systems require human labeled data for evaluation, and often for training. The standard practice used in gathering this data minimizes disagreement between annotators, and we have found this results in data that fails to…

Computation and Language · Computer Science 2018-09-27 Anca Dumitrache, Lora Aroyo, Chris Welty

As a means of human-based computation, crowdsourcing has been widely used to annotate large-scale unlabeled datasets. One of the obvious challenges is how to aggregate these possibly noisy labels provided by a set of heterogeneous…

Machine Learning · Computer Science 2020-10-20 Xuan Wei, Daniel Dajun Zeng, Junming Yin

Sequence labeling is a fundamental framework for various natural language processing problems. Its performance is largely influenced by the annotation quality and quantity in supervised learning scenarios, and obtaining ground truth labels…

Computation and Language · Computer Science 2020-04-17 Ouyu Lan, Xiao Huang, Bill Yuchen Lin, He Jiang, Liyuan Liu, Xiang Ren

Estimation of semantic similarity is crucial for a variety of natural language processing (NLP) tasks. In the absence of a general theory of semantic information, many papers rely on human annotators as the source of ground truth for…

Computation and Language · Computer Science 2021-09-27 Shaul Solomon, Adam Cohn, Hernan Rosenblum, Chezi Hershkovitz, Ivan P. Yamshchikov

Current methods for sequence tagging, a core task in NLP, are data hungry, which motivates the use of crowdsourcing as a cheap way to obtain labelled data. However, annotators are often unreliable and current aggregation methods cannot…

Computation and Language · Computer Science 2019-09-09 Edwin Simpson, Iryna Gurevych

Crowd-sourcing is a cheap and popular means of creating training and evaluation datasets for machine learning; however, it poses the problem of 'truth inference', as individual workers cannot be wholly trusted to provide reliable…

Machine Learning · Computer Science 2019-02-26 Yuan Li, Benjamin I. P. Rubinstein, Trevor Cohn

Annotation quality and quantity positively affect the learning performance of sequence labeling, a vital task in Natural Language Processing. Hiring domain experts to annotate a corpus is very costly in terms of money and time.…

Human-Computer Interaction · Computer Science 2023-07-04 Nasim Sabetpour, Adithya Kulkarni, Sihong Xie, Qi Li

Labeling real-world datasets is time consuming but indispensable for supervised machine learning models. A common solution is to distribute the labeling task across a large number of non-expert workers via crowd-sourcing. Due to the varying…

Machine Learning · Computer Science 2020-11-16 Taraneh Younesian, Chi Hong, Amirmasoud Ghiassi, Robert Birke, Lydia Y. Chen

Supervised learning, especially supervised deep learning, requires large amounts of labeled data. One approach to collect large amounts of labeled data is by using a crowdsourcing platform where numerous workers perform the annotation…

Machine Learning · Computer Science 2023-08-22 Kosuke Yoshimura, Hisashi Kashima

Samples with ground truth labels may not always be available in numerous domains. While learning from crowdsourcing labels has been explored, existing models can still fail in the presence of sparse, unreliable, or diverging annotations.…

Machine Learning · Computer Science 2021-12-07 Mani Sotoodeh, Li Xiong, Joyce C. Ho

Distant supervision is a popular method for performing relation extraction from text that is known to produce noisy labels. Most progress in relation extraction and classification has been made with crowdsourced corrections to…

Computation and Language · Computer Science 2022-09-21 Anca Dumitrache, Lora Aroyo, Chris Welty

Crowdsourcing is a relatively economic and efficient solution to collect annotations from the crowd through online platforms. Answers collected from workers with different expertise may be noisy and unreliable, and the quality of annotated…

Machine Learning · Computer Science 2020-01-08 Jingzheng Tu, Guoxian Yu, Jun Wang, Carlotta Domeniconi, Xiangliang Zhang

A popular approach for large scale data annotation tasks is crowdsourcing, wherein each data point is labeled by multiple noisy annotators. We consider the problem of inferring ground truth from noisy ordinal labels obtained from multiple…

Machine Learning · Statistics 2013-05-02 Balaji Lakshminarayanan, Yee Whye Teh

We propose a meta-learning method for learning from multiple noisy annotators. In many applications such as crowdsourcing services, labels for supervised learning are given by multiple annotators. Since the annotators have different skills…

Machine Learning · Computer Science 2025-06-13 Atsutoshi Kumagai, Tomoharu Iwata, Taishi Nishiyama, Yasutoshi Ida, Yasuhiro Fujiwara