Related papers: Efficient PAC Learning from the Crowd

Crowdsourced PAC Learning under Classification Noise

In this paper, we analyze PAC learnability from labels produced by crowdsourcing. In our setting, unlabeled examples are drawn from a distribution and labels are crowdsourced from workers who operate under classification noise, each with…

Machine Learning · Computer Science 2019-02-14 Shelby Heinecke, Lev Reyzin

Efficient PAC Learning from the Crowd with Pairwise Comparisons

We study crowdsourced PAC learning of threshold functions, where the labels are gathered from a pool of annotators some of whom may behave adversarially. This is yet a challenging problem and until recently has computationally and query…

Machine Learning · Computer Science 2022-12-07 Shiwei Zeng, Jie Shen

Semi-verified PAC Learning from the Crowd

We study the problem of crowdsourced PAC learning of threshold functions. This is a challenging problem and only recently have query-efficient algorithms been established under the assumption that a noticeable fraction of the workers are…

Machine Learning · Computer Science 2023-05-22 Shiwei Zeng, Jie Shen

Label Selection Approach to Learning from Crowds

Supervised learning, especially supervised deep learning, requires large amounts of labeled data. One approach to collect large amounts of labeled data is by using a crowdsourcing platform where numerous workers perform the annotation…

Machine Learning · Computer Science 2023-08-22 Kosuke Yoshimura, Hisashi Kashima

Crowd Labeling: a survey

Recently, there has been a burst in the number of research projects on human computation via crowdsourcing. Multiple choice (or labeling) questions could be referred to as a common type of problem which is solved by this approach. As an…

Artificial Intelligence · Computer Science 2014-09-04 Jafar Muhammadi, Hamid Reza Rabiee, Abbas Hosseini

Active Learning for Crowd-Sourced Databases

Crowd-sourcing has become a popular means of acquiring labeled data for a wide variety of tasks where humans are more accurate than computers, e.g., labeling images, matching objects, or analyzing sentiment. However, relying solely on the…

Machine Learning · Computer Science 2014-12-23 Barzan Mozafari, Purnamrita Sarkar, Michael J. Franklin, Michael I. Jordan, Samuel Madden

Ballpark Crowdsourcing: The Wisdom of Rough Group Comparisons

Crowdsourcing has become a popular method for collecting labeled training data. However, in many practical scenarios traditional labeling can be difficult for crowdworkers (for example, if the data is high-dimensional or unintuitive, or the…

Machine Learning · Statistics 2017-12-14 Tom Hope, Dafna Shahaf

Efficiency of active learning for the allocation of workers on crowdsourced classification tasks

Crowdsourcing has been successfully employed in the past as an effective and cheap way to execute classification tasks and has therefore attracted the attention of the research community. However, we still lack a theoretical understanding…

Human-Computer Interaction · Computer Science 2016-10-20 Edoardo Manino, Long Tran-Thanh, Nicholas R. Jennings

Optimizing the Wisdom of the Crowd: Inference, Learning, and Teaching

The unprecedented demand for large amount of data has catalyzed the trend of combining human insights with machine learning techniques, which facilitate the use of crowdsourcing to enlist label information both effectively and efficiently.…

Machine Learning · Statistics 2018-06-26 Yao Zhou, Jingrui He

Candidate Labeling for Crowd Learning

Crowdsourcing has become very popular among the machine learning community as a way to obtain labels that allow a ground truth to be estimated for a given dataset. In most of the approaches that use crowdsourced labels, annotators are asked…

Machine Learning · Statistics 2018-08-09 Iker Beñaran-Muñoz, Jerónimo Hernández-González, Aritz Pérez

Representation Learning from Limited Educational Data with Crowdsourced Labels

Representation learning has been proven to play an important role in the unprecedented success of machine learning models in numerous tasks, such as machine translation, face recognition and recommendation. The majority of existing…

Machine Learning · Computer Science 2020-09-24 Wentao Wang, Guowei Xu, Wenbiao Ding, Gale Yan Huang, Guoliang Li, Jiliang Tang, Zitao Liu

Practice of Efficient Data Collection via Crowdsourcing at Large-Scale

Modern machine learning algorithms need large datasets to be trained. Crowdsourcing has become a popular approach to label large datasets in a shorter time as well as at a lower cost comparing to that needed for a limited number of experts.…

Human-Computer Interaction · Computer Science 2019-12-11 Alexey Drutsa, Viktoriya Farafonova, Valentina Fedorova, Olga Megorskaya, Evfrosiniya Zerminova, Olga Zhilinskaya

Learning Effective Embeddings From Crowdsourced Labels: An Educational Case Study

Learning representation has been proven to be helpful in numerous machine learning tasks. The success of the majority of existing representation learning approaches often requires a large amount of consistent and noise-free labels. However,…

Human-Computer Interaction · Computer Science 2019-08-02 Guowei Xu, Wenbiao Ding, Jiliang Tang, Songfan Yang, Gale Yan Huang, Zitao Liu

Deep learning from crowds

Over the last few years, deep learning has revolutionized the field of machine learning by dramatically improving the state-of-the-art in various domains. However, as the size of supervised artificial neural networks grows, typically so…

Machine Learning · Statistics 2017-12-27 Filipe Rodrigues, Francisco Pereira

Learning From Crowdsourced Noisy Labels: A Signal Processing Perspective

One of the primary catalysts fueling advances in artificial intelligence (AI) and machine learning (ML) is the availability of massive, curated datasets. A commonly used technique to curate such massive datasets is crowdsourcing, where data…

Signal Processing · Electrical Eng. & Systems 2025-07-04 Shahana Ibrahim, Panagiotis A. Traganitis, Xiao Fu, Georgios B. Giannakis

crowd-hpo: Realistic Hyperparameter Optimization and Benchmarking for Learning from Crowds with Noisy Labels

Crowdworking is a cost-efficient solution for acquiring class labels. Since these labels are subject to noise, various approaches to learning from crowds have been proposed. Typically, these approaches are evaluated with default…

Machine Learning · Computer Science 2025-07-18 Marek Herde, Lukas Lührs, Denis Huseljic, Bernhard Sick

Learning From Noisy Singly-labeled Data

Supervised learning depends on annotated examples, which are taken to be the ground truth. But these labels often come from noisy crowdsourcing platforms, like Amazon Mechanical Turk. Practitioners typically collect multiple labels…

Machine Learning · Computer Science 2018-05-22 Ashish Khetan, Zachary C. Lipton, Anima Anandkumar

End-to-End Learning from Noisy Crowd to Supervised Machine Learning Models

Labeling real-world datasets is time consuming but indispensable for supervised machine learning models. A common solution is to distribute the labeling task across a large number of non-expert workers via crowd-sourcing. Due to the varying…

Machine Learning · Computer Science 2020-11-16 Taraneh Younesian, Chi Hong, Amirmasoud Ghiassi, Robert Birke, Lydia Y. Chen

A Data Management Approach for Dataset Selection Using Human Computation

As the number of applications that use machine learning algorithms increases, the need for labeled data useful for training such algorithms intensifies. Getting labels typically involves employing humans to do the annotation, which directly…

Machine Learning · Computer Science 2013-07-16 Alexandros Ntoulas, Omar Alonso, Vasilis Kandylas

Crowd-Certain: Label Aggregation in Crowdsourced and Ensemble Learning Classification

Crowdsourcing systems have been used to accumulate massive amounts of labeled data for applications such as computer vision and natural language processing. However, because crowdsourced labeling is inherently dynamic and uncertain,…

Machine Learning · Computer Science 2023-10-26 Mohammad S. Majdi, Jeffrey J. Rodriguez