Related papers: Bayesian Methods for Semi-supervised Text Annotati…

Deep Bayesian Self-Training

Supervised Deep Learning has been highly successful in recent years, achieving state-of-the-art results in most tasks. However, with the ongoing uptake of such methods in industrial applications, the requirement for large amounts of…

Computer Vision and Pattern Recognition · Computer Science 2019-07-18 Fabio De Sousa Ribeiro, Francesco Caliva, Mark Swainson, Kjartan Gudmundsson, Georgios Leontidis, Stefanos Kollias

Auto-Annotation Quality Prediction for Semi-Supervised Learning with Ensembles

Auto-annotation by ensemble of models is an efficient method of learning on unlabeled data. Wrong or inaccurate annotations generated by the ensemble may lead to performance degradation of the trained model. To deal with this problem we…

Computer Vision and Pattern Recognition · Computer Science 2024-03-14 Dror Simon, Miriam Farber, Roman Goldenberg

Annotation Error Detection: Analyzing the Past and Present for a More Coherent Future

Annotated data is an essential ingredient in natural language processing for training and evaluating machine learning models. It is therefore very desirable for the annotations to be of high quality. Recent work, however, has shown that…

Computation and Language · Computer Science 2022-09-27 Jan-Christoph Klie, Bonnie Webber, Iryna Gurevych

Leveraging Annotator Disagreement for Text Classification

It is common practice in text classification to only use one majority label for model training even if a dataset has been annotated by multiple annotators. Doing so can remove valuable nuances and diverse perspectives inherent in the…

Computation and Language · Computer Science 2024-09-27 Jin Xu, Mariët Theune, Daniel Braun

Agreeing to Disagree: Annotating Offensive Language Datasets with Annotators' Disagreement

Since state-of-the-art approaches to offensive language detection rely on supervised learning, it is crucial to quickly adapt them to the continuously evolving scenario of social media. While several approaches have been proposed to tackle…

Computation and Language · Computer Science 2022-10-17 Elisa Leonardelli, Stefano Menini, Alessio Palmero Aprosio, Marco Guerini, Sara Tonelli

Detecting discriminatory risk through data annotation based on Bayesian inferences

Thanks to the increasing growth of computational power and data availability, the research in machine learning has advanced with tremendous rapidity. Nowadays, the majority of automatic decision making systems are based on data. However, it…

Machine Learning · Computer Science 2021-01-28 Elena Beretta, Antonio Vetrò, Bruno Lepri, Juan Carlos De Martin

Dealing with Disagreements: Looking Beyond the Majority Vote in Subjective Annotations

Majority voting and averaging are common approaches employed to resolve annotator disagreements and derive single ground truth labels from multiple annotations. However, annotators may systematically disagree with one another, often…

Computation and Language · Computer Science 2021-10-13 Aida Mostafazadeh Davani, Mark Díaz, Vinodkumar Prabhakaran

Meta-learning Representations for Learning from Multiple Annotators

We propose a meta-learning method for learning from multiple noisy annotators. In many applications such as crowdsourcing services, labels for supervised learning are given by multiple annotators. Since the annotators have different skills…

Machine Learning · Computer Science 2025-06-13 Atsutoshi Kumagai, Tomoharu Iwata, Taishi Nishiyama, Yasutoshi Ida, Yasuhiro Fujiwara

Semi-Supervised and Unsupervised Sense Annotation via Translations

Acquisition of multilingual training data continues to be a challenge in word sense disambiguation (WSD). To address this problem, unsupervised approaches have been proposed to automatically generate sense annotations for training…

Computation and Language · Computer Science 2021-09-21 Bradley Hauer, Grzegorz Kondrak, Yixing Luan, Arnob Mallik, Lili Mou

SemiMemes: A Semi-supervised Learning Approach for Multimodal Memes Analysis

The prevalence of memes on social media has created the need to sentiment analyze their underlying meanings for censoring harmful content. Meme censoring systems by machine learning raise the need for a semi-supervised learning solution to…

Machine Learning · Computer Science 2023-05-17 Pham Thai Hoang Tung, Nguyen Tan Viet, Ngo Tien Anh, Phan Duy Hung

Unsupervised Data Augmentation for Consistency Training

Semi-supervised learning lately has shown much promise in improving deep learning models when labeled data is scarce. Common among recent approaches is the use of consistency training on a large amount of unlabeled data to constrain model…

Machine Learning · Computer Science 2020-11-06 Qizhe Xie, Zihang Dai, Eduard Hovy, Minh-Thang Luong, Quoc V. Le

Modeling Multiple Annotator Expertise in the Semi-Supervised Learning Scenario

Learning algorithms normally assume that there is at most one annotation or label per data point. However, in some scenarios, such as medical diagnosis and on-line collaboration, multiple annotations may be available. In either case,…

Machine Learning · Computer Science 2012-03-19 Yan Yan, Romer Rosales, Glenn Fung, Jennifer Dy

A General Model for Aggregating Annotations Across Simple, Complex, and Multi-Object Annotation Tasks

Human annotations are vital to supervised learning, yet annotators often disagree on the correct label, especially as annotation tasks increase in complexity. A strategy to improve label quality is to ask multiple annotators to label the…

Machine Learning · Computer Science 2023-12-22 Alexander Braylan, Madalyn Marabella, Omar Alonso, Matthew Lease

Multi-Label Annotation Aggregation in Crowdsourcing

As a means of human-based computation, crowdsourcing has been widely used to annotate large-scale unlabeled datasets. One of the obvious challenges is how to aggregate these possibly noisy labels provided by a set of heterogeneous…

Machine Learning · Computer Science 2020-10-20 Xuan Wei, Daniel Dajun Zeng, Junming Yin

Learning from Imperfect Annotations

Many machine learning systems today are trained on large amounts of human-annotated data. Data annotation tasks that require a high level of competency make data acquisition expensive, while the resulting labels are often subjective,…

Machine Learning · Computer Science 2020-04-08 Emmanouil Antonios Platanios, Maruan Al-Shedivat, Eric Xing, Tom Mitchell

A Bayesian Data Augmentation Approach for Learning Deep Models

Data augmentation is an essential part of the training process applied to deep learning models. The motivation is that a robust training process for deep learning models depends on large annotated datasets, which are expensive to be…

Computer Vision and Pattern Recognition · Computer Science 2017-10-31 Toan Tran, Trung Pham, Gustavo Carneiro, Lyle Palmer, Ian Reid

Bayesian Semisupervised Learning with Deep Generative Models

Neural network based generative models with discriminative components are a powerful approach for semi-supervised learning. However, these techniques a) cannot account for model uncertainty in the estimation of the model's discriminative…

Machine Learning · Statistics 2017-06-30 Jonathan Gordon, José Miguel Hernández-Lobato

Capturing Perspectives of Crowdsourced Annotators in Subjective Learning Tasks

Supervised classification heavily depends on datasets annotated by humans. However, in subjective tasks such as toxicity classification, these annotations often exhibit low agreement among raters. Annotations have commonly been aggregated…

Computation and Language · Computer Science 2024-05-17 Negar Mokhberian, Myrl G. Marmarelis, Frederic R. Hopp, Valerio Basile, Fred Morstatter, Kristina Lerman

Learning from Multiple Expert Annotators for Enhancing Anomaly Detection in Medical Image Analysis

Building an accurate computer-aided diagnosis system based on data-driven approaches requires a large amount of high-quality labeled data. In medical imaging analysis, multiple expert annotators often produce subjective estimates about…

Computer Vision and Pattern Recognition · Computer Science 2022-03-22 Khiem H. Le, Tuan V. Tran, Hieu H. Pham, Hieu T. Nguyen, Tung T. Le, Ha Q. Nguyen

Pre-Trained Vision-Language Models as Partial Annotators

Pre-trained vision-language models learn massive data to model unified representations of images and natural languages, which can be widely applied to downstream machine learning tasks. In addition to zero-shot inference, in order to better…

Computer Vision and Pattern Recognition · Computer Science 2024-06-28 Qian-Wei Wang, Yuqiu Xie, Letian Zhang, Zimo Liu, Shu-Tao Xia