
Related papers: Learning to Contextually Aggregate Multi-Source Su…

200 papers

Crowdsourcing provides a practical way to obtain large amounts of labeled data at a low cost. However, the annotation quality of annotators varies considerably, which imposes new challenges in learning a high-quality model from the…

Machine Learning · Computer Science 2021-06-15 Zhendong Chu, Jing Ma, Hongning Wang

Annotation quality and quantity positively affect the learning performance of sequence labeling, a vital task in Natural Language Processing. Hiring domain experts to annotate a corpus is very costly in terms of money and time.…

Human-Computer Interaction · Computer Science 2023-07-04 Nasim Sabetpour, Adithya Kulkarni, Sihong Xie, Qi Li

Many machine learning systems today are trained on large amounts of human-annotated data. Data annotation tasks that require a high level of competency make data acquisition expensive, while the resulting labels are often subjective,…

Machine Learning · Computer Science 2020-04-08 Emmanouil Antonios Platanios, Maruan Al-Shedivat, Eric Xing, Tom Mitchell

Selecting an effective training signal for machine learning tasks is difficult: expert annotations are expensive, and crowd-sourced annotations may not be reliable. Recent work has demonstrated that learning from a distribution over labels…

Computation and Language · Computer Science 2025-04-23 Dustin Wright, Isabelle Augenstein

Traditional supervised learning requires ground truth labels for the training data, whose collection can be difficult in many cases. Recently, crowdsourcing has established itself as an efficient labeling solution through resorting to…

Machine Learning · Computer Science 2021-07-13 Ye Shi, Shao-Yuan Li, Sheng-Jun Huang

Cloze-style reading comprehension has been a popular task for measuring the progress of natural language understanding in recent years. In this paper, we design a novel multi-perspective framework, which can be seen as the joint training of…

Computation and Language · Computer Science 2018-08-21 Liang Wang, Sujian Li, Wei Zhao, Kewei Shen, Meng Sun, Ruoyu Jia, Jingming Liu

Crowdsourcing is regarded as one prospective solution for effective supervised learning, aiming to build large-scale annotated training data by crowd workers. Previous studies focus on reducing the influences from the noises of the…

Computation and Language · Computer Science 2021-11-16 Xin Zhang, Guangwei Xu, Yueheng Sun, Meishan Zhang, Pengjun Xie

Supervised learning, especially supervised deep learning, requires large amounts of labeled data. One approach to collect large amounts of labeled data is by using a crowdsourcing platform where numerous workers perform the annotation…

Machine Learning · Computer Science 2023-08-22 Kosuke Yoshimura, Hisashi Kashima

In this paper, we propose a novel framework that combines ensemble learning with augmented graph structures to improve the performance and robustness of semi-supervised node classification in graphs. By creating multiple augmented views of…

Machine Learning · Computer Science 2025-03-25 Maryam Abdolali, Romina Zakerian, Behnam Roshanfekr, Fardin Ayar, Mohammad Rahmati

Most crowdsourcing learning methods treat disagreement between annotators as noisy labelings while inter-disagreement among experts is often a good indicator for the ambiguity and uncertainty that is inherent in natural language. In this…

Computation and Language · Computer Science 2023-01-05 Xiaolei Lu

Supervised machine learning assumes that labeled data provide accurate measurements of the concepts models are meant to learn. Yet in practice, human labeling introduces systematic variation arising from ambiguous items, divergent…

Methodology · Statistics 2026-04-10 Robert Chew, Stephanie Eckman, Christoph Kern, Frauke Kreuter

As datasets grow larger, accurately annotating them is becoming increasingly impractical because of the cost in both time and money. Therefore, crowd-sourcing has been widely adopted to alleviate the cost of…

Machine Learning · Computer Science 2024-02-21 Hansong Zhang, Shikun Li, Dan Zeng, Chenggang Yan, Shiming Ge

Crowd sequential annotations can be an efficient and cost-effective way to build large datasets for sequence labeling. Different from tagging independent instances, for crowd sequential annotations the quality of label sequence relies on…

Computation and Language · Computer Science 2022-09-21 Xiaolei Lu, Tommy W. S. Chow

We propose a meta-learning method for learning from multiple noisy annotators. In many applications such as crowdsourcing services, labels for supervised learning are given by multiple annotators. Since the annotators have different skills…

Machine Learning · Computer Science 2025-06-13 Atsutoshi Kumagai, Tomoharu Iwata, Taishi Nishiyama, Yasutoshi Ida, Yasuhiro Fujiwara

Researchers have raised awareness about the harms of aggregating labels especially in subjective tasks that naturally contain disagreements among human annotators. In this work we show that models that are only provided aggregated labels…

Computation and Language · Computer Science 2024-03-08 Abhishek Anand, Negar Mokhberian, Prathyusha Naresh Kumar, Anweasha Saha, Zihao He, Ashwin Rao, Fred Morstatter, Kristina Lerman

As a means of human-based computation, crowdsourcing has been widely used to annotate large-scale unlabeled datasets. One of the obvious challenges is how to aggregate these possibly noisy labels provided by a set of heterogeneous…

Machine Learning · Computer Science 2020-10-20 Xuan Wei, Daniel Dajun Zeng, Junming Yin

Human annotations are vital to supervised learning, yet annotators often disagree on the correct label, especially as annotation tasks increase in complexity. A strategy to improve label quality is to ask multiple annotators to label the…

Machine Learning · Computer Science 2023-12-22 Alexander Braylan, Madalyn Marabella, Omar Alonso, Matthew Lease

Relying on crowdsourced workers, data crowdsourcing platforms are able to efficiently provide vast amounts of labeled data. Due to the variability in the annotation quality of crowd workers, modern techniques resort to redundant annotations…

Human-Computer Interaction · Computer Science 2023-11-28 Haoyu Liu, Fei Wang, Minmin Lin, Runze Wu, Renyu Zhu, Shiwei Zhao, Kai Wang, Tangjie Lv, Changjie Fan

Samples with ground truth labels may not always be available in numerous domains. While learning from crowdsourcing labels has been explored, existing models can still fail in the presence of sparse, unreliable, or diverging annotations.…

Machine Learning · Computer Science 2021-12-07 Mani Sotoodeh, Li Xiong, Joyce C. Ho

Multi-source domain adaptation aims at leveraging the knowledge from multiple tasks for predicting a related target domain. Hence, a crucial aspect is to properly combine different sources based on their relations. In this paper, we…

Machine Learning · Computer Science 2021-06-16 Changjian Shui, Zijian Li, Jiaqi Li, Christian Gagné, Charles Ling, Boyu Wang