Related papers: Active Learning from Crowd in Document Screening

Active Learning for Crowd-Sourced Databases

Crowd-sourcing has become a popular means of acquiring labeled data for a wide variety of tasks where humans are more accurate than computers, e.g., labeling images, matching objects, or analyzing sentiment. However, relying solely on the…

Machine Learning · Computer Science 2014-12-23 Barzan Mozafari, Purnamrita Sarkar, Michael J. Franklin, Michael I. Jordan, Samuel Madden

Active Crowd Counting with Limited Supervision

To learn a reliable people counter from crowd images, head center annotations are normally required. Annotating head centers is however a laborious and tedious process in dense crowds. In this paper, we present an active learning framework…

Computer Vision and Pattern Recognition · Computer Science 2020-07-16 Zhen Zhao, Miaojing Shi, Xiaoxiao Zhao, Li Li

Combining Crowd and Machines for Multi-predicate Item Screening

This paper discusses how crowd and machine classifiers can be efficiently combined to screen items that satisfy a set of predicates. We show that this is a recurring problem in many domains, present machine-human (hybrid) algorithms that…

Information Retrieval · Computer Science 2019-04-02 Evgeny Krivosheev, Fabio Casati, Marcos Baez, Boualem Benatallah

Crowd-Machine Collaboration for Item Screening

In this paper we describe how crowd and machine classifiers can be efficiently combined to screen items that satisfy a set of predicates. We show that this is a recurring problem in many domains, present machine-human (hybrid) algorithms…

Human-Computer Interaction · Computer Science 2018-03-22 Evgeny Krivosheev, Bahareh Harandizadeh, Fabio Casati, Boualem Benatallah

Crowd-based Multi-Predicate Screening of Papers in Literature Reviews

Systematic literature reviews (SLRs) are one of the most common and useful forms of scientific research and publication. Tens of thousands of SLRs are published each year, and this rate is growing across all fields of science. Performing an…

Human-Computer Interaction · Computer Science 2018-03-28 Evgeny Krivosheev, Fabio Casati, Boualem Benatallah

Multi-Label Active Learning from Crowds

Multi-label active learning is a hot topic in reducing the label cost by optimally choosing the most valuable instance to query its label from an oracle. In this paper, we consider the pool-based multi-label active learning under the…

Machine Learning · Computer Science 2015-08-05 Shao-Yuan Li, Yuan Jiang, Zhi-Hua Zhou

Active Scene Learning

Sketch recognition allows natural and efficient interaction in pen-based interfaces. A key obstacle to building accurate sketch recognizers has been the difficulty of creating large amounts of annotated training data. Several authors have…

Computer Vision and Pattern Recognition · Computer Science 2019-03-08 Erelcan Yanik, Tevfik Metin Sezgin

An Active Learning Approach for Jointly Estimating Worker Performance and Annotation Reliability with Crowdsourced Data

Crowdsourcing platforms offer a practical solution to the problem of affordably annotating large datasets for training supervised classifiers. Unfortunately, poor worker performance frequently threatens to compromise annotation reliability,…

Machine Learning · Computer Science 2014-01-17 Liyue Zhao, Yu Zhang, Gita Sukthankar

An Active Learning Based Approach For Effective Video Annotation And Retrieval

Conventional multimedia annotation/retrieval systems such as the Normalized Continuous Relevance Model (NormCRM) [16] require fully labeled training data for good performance. Active Learning, by determining an order for labeling the…

Multimedia · Computer Science 2015-04-28 Moitreya Chatterjee, Anton Leuski

Label Selection Approach to Learning from Crowds

Supervised learning, especially supervised deep learning, requires large amounts of labeled data. One approach to collect large amounts of labeled data is by using a crowdsourcing platform where numerous workers perform the annotation…

Machine Learning · Computer Science 2023-08-22 Kosuke Yoshimura, Hisashi Kashima

Active Multi-Label Crowd Consensus

Crowdsourcing is an economic and efficient strategy aimed at collecting annotations of data through an online platform. Crowd workers with different expertise are paid for their service, and the task requester usually has a limited budget.…

Machine Learning · Computer Science 2019-11-11 Jinzheng Tu, Guoxian Yu, Carlotta Domeniconi, Jun Wang, Xiangliang Zhang

Fooling the Crowd with Deep Learning-based Methods

Modern, state-of-the-art deep learning approaches yield human-like performance in numerous object detection and classification tasks. The foundation for their success is the availability of training datasets of substantially high quantity,…

Human-Computer Interaction · Computer Science 2019-12-03 Christian Marzahl, Marc Aubreville, Christof A. Bertram, Stefan Gerlach, Jennifer Maier, Jörn Voigt, Jenny Hill, Robert Klopfleisch, Andreas Maier

A Data Management Approach for Dataset Selection Using Human Computation

As the number of applications that use machine learning algorithms increases, the need for labeled data useful for training such algorithms intensifies. Getting labels typically involves employing humans to do the annotation, which directly…

Machine Learning · Computer Science 2013-07-16 Alexandros Ntoulas, Omar Alonso, Vasilis Kandylas

Efficiency of active learning for the allocation of workers on crowdsourced classification tasks

Crowdsourcing has been successfully employed in the past as an effective and cheap way to execute classification tasks and has therefore attracted the attention of the research community. However, we still lack a theoretical understanding…

Human-Computer Interaction · Computer Science 2016-10-20 Edoardo Manino, Long Tran-Thanh, Nicholas R. Jennings

An Analysis of Active Learning Algorithms using Real-World Crowd-sourced Text Annotations

Active learning algorithms automatically identify the most informative samples from large amounts of unlabeled data and tremendously reduce human annotation effort in inducing a machine learning model. In a conventional active learning…

Machine Learning · Computer Science 2026-04-28 Varun Totakura, Ankita Singh, Yushun Dong, Shayok Chakraborty

Identifying Wrongly Predicted Samples: A Method for Active Learning

State-of-the-art machine learning models require access to a significant amount of annotated data in order to achieve the desired level of performance. While unlabelled data can be largely available and even abundant, annotation process can…

Machine Learning · Computer Science 2020-10-15 Rahaf Aljundi, Nikolay Chumerin, Daniel Olmeda Reino

Aggregating Soft Labels from Crowd Annotations Improves Uncertainty Estimation Under Distribution Shift

Selecting an effective training signal for machine learning tasks is difficult: expert annotations are expensive, and crowd-sourced annotations may not be reliable. Recent work has demonstrated that learning from a distribution over labels…

Computation and Language · Computer Science 2025-04-23 Dustin Wright, Isabelle Augenstein

Active clustering for labeling training data

Gathering training data is a key step of any supervised learning task, and it is both critical and expensive. Critical, because the quantity and quality of the training data has a high impact on the performance of the learned function.…

Data Structures and Algorithms · Computer Science 2021-10-28 Quentin Lutz, Élie de Panafieu, Alex Scott, Maya Stein

Clean or Annotate: How to Spend a Limited Data Collection Budget

Crowdsourcing platforms are often used to collect datasets for training machine learning models, despite higher levels of inaccurate labeling compared to expert labeling. There are two common strategies to manage the impact of such noise.…

Computation and Language · Computer Science 2022-06-14 Derek Chen, Zhou Yu, Samuel R. Bowman

Compute-Efficient Active Learning

Active learning, a powerful paradigm in machine learning, aims at reducing labeling costs by selecting the most informative samples from an unlabeled dataset. However, the traditional active learning process often demands extensive…

Machine Learning · Computer Science 2024-01-17 Gábor Németh, Tamás Matuszka