Related papers: Handling Missing Annotations in Supervised Learnin…

Robust Assignment of Labels for Active Learning with Sparse and Noisy Annotations

Supervised classification algorithms are used to solve a growing number of real-life problems around the globe. Their performance is strictly connected with the quality of labels used in training. Unfortunately, acquiring good-quality…

Machine Learning · Computer Science 2024-07-08 Daniel Kałuża, Andrzej Janusz, Dominik Ślęzak

Modeling Multiple Annotator Expertise in the Semi-Supervised Learning Scenario

Learning algorithms normally assume that there is at most one annotation or label per data point. However, in some scenarios, such as medical diagnosis and on-line collaboration, multiple annotations may be available. In either case,…

Machine Learning · Computer Science 2012-03-19 Yan Yan, Romer Rosales, Glenn Fung, Jennifer Dy

Survey of Active Learning Hyperparameters: Insights from a Large-Scale Experimental Grid

Annotating data is a time-consuming and costly task, but it is inherently required for supervised machine learning. Active Learning (AL) is an established method that minimizes human labeling effort by iteratively selecting the most…

Machine Learning · Computer Science 2025-06-05 Julius Gonsior, Tim Rieß, Anja Reusch, Claudio Hartmann, Maik Thiele, Wolfgang Lehner

Reducing Label Effort: Self-Supervised meets Active Learning

Active learning is a paradigm aimed at reducing the annotation effort by training the model on actively selected informative and/or representative samples. Another paradigm to reduce the annotation effort is self-training that learns from a…

Computer Vision and Pattern Recognition · Computer Science 2021-08-27 Javad Zolfaghari Bengar, Joost van de Weijer, Bartlomiej Twardowski, Bogdan Raducanu

LEAN-LIFE: A Label-Efficient Annotation Framework Towards Learning from Explanation

Successfully training a deep neural network demands a huge corpus of labeled data. However, each label only provides limited information to learn from and collecting the requisite number of labels involves massive human effort. In this…

Computation and Language · Computer Science 2020-04-17 Dong-Ho Lee, Rahul Khanna, Bill Yuchen Lin, Jamin Chen, Seyeon Lee, Qinyuan Ye, Elizabeth Boschee, Leonardo Neves, Xiang Ren

On the Limitations of Simulating Active Learning

Active learning (AL) is a human-and-model-in-the-loop paradigm that iteratively selects informative unlabeled data for human annotation, aiming to improve over random sampling. However, performing AL experiments with human annotations…

Machine Learning · Computer Science 2023-05-24 Katerina Margatina, Nikolaos Aletras

Data Shapley Valuation for Efficient Batch Active Learning

Annotating the right set of data amongst all available data points is a key challenge in many machine learning applications. Batch active learning is a popular approach to address this, in which batches of unlabeled data points are selected…

Machine Learning · Statistics 2021-04-20 Amirata Ghorbani, James Zou, Andre Esteva

PT4AL: Using Self-Supervised Pretext Tasks for Active Learning

Labeling a large set of data is expensive. Active learning aims to tackle this problem by asking to annotate only the most informative data from the unlabeled set. We propose a novel active learning approach that utilizes self-supervised…

Computer Vision and Pattern Recognition · Computer Science 2022-07-27 John Seon Keun Yi, Minseok Seo, Jongchan Park, Dong-Geol Choi

Human Activity Recognition on wrist-worn accelerometers using self-supervised neural networks

Measures of Activity of Daily Living (ADL) are an important indicator of overall health but difficult to measure in-clinic. Automated and accurate human activity recognition (HAR) using wrist-worn accelerometers enables practical and cost…

Machine Learning · Computer Science 2021-12-24 Niranjan Sridhar, Lance Myers

Semantic Segmentation with Active Semi-Supervised Learning

Using deep learning, we now have the ability to create exceptionally good semantic segmentation systems; however, collecting the prerequisite pixel-wise annotations for training images remains expensive and time-consuming. Therefore, it…

Computer Vision and Pattern Recognition · Computer Science 2022-10-19 Aneesh Rangnekar, Christopher Kanan, Matthew Hoffman

A Semi-supervised Learning Approach with Two Teachers to Improve Breakdown Identification in Dialogues

Identifying breakdowns in ongoing dialogues helps to improve communication effectiveness. Most prior work on this topic relies on human annotated data and data augmentation to learn a classification model. While quality labeled dialogue…

Computation and Language · Computer Science 2022-04-20 Qian Lin, Hwee Tou Ng

Self-supervised Semi-supervised Learning for Data Labeling and Quality Evaluation

As the adoption of deep learning techniques in industrial applications grows with increasing speed and scale, successful deployment of deep learning models often hinges on the availability, volume, and quality of annotated data. In this…

Computer Vision and Pattern Recognition · Computer Science 2021-11-23 Haoping Bai, Meng Cao, Ping Huang, Jiulong Shan

Semi-Supervised Audio Classification with Partially Labeled Data

Audio classification has seen great progress with the increasing availability of large-scale datasets. These large datasets, however, are often only partially labeled as collecting full annotations is a tedious and expensive process. This…

Sound · Computer Science 2021-11-29 Siddharth Gururani, Alexander Lerch

Reassessing Active Learning Adoption in Contemporary NLP: A Community Survey

Supervised learning relies on data annotation which usually is time-consuming and therefore expensive. A longstanding strategy to reduce annotation costs is active learning, an iterative process, in which a human annotates only data…

Computation and Language · Computer Science 2026-02-03 Julia Romberg, Christopher Schröder, Julius Gonsior, Katrin Tomanek, Fredrik Olsson

Revisiting Active Learning under (Human) Label Variation

Access to high-quality labeled data remains a limiting factor in applied supervised learning. While label variation (LV), i.e., differing labels for the same instance, is common, especially in natural language processing, annotation…

Computation and Language · Computer Science 2025-07-04 Cornelia Gruber, Helen Alber, Bernd Bischl, Göran Kauermann, Barbara Plank, Matthias Aßenmacher

Boosting Active Learning via Improving Test Performance

Central to active learning (AL) is what data should be selected for annotation. Existing works attempt to select highly uncertain or informative data for annotation. Nevertheless, it remains unclear how selected data impacts the test…

Machine Learning · Computer Science 2022-01-25 Tianyang Wang, Xingjian Li, Pengkun Yang, Guosheng Hu, Xiangrui Zeng, Siyu Huang, Cheng-Zhong Xu, Min Xu

A Unified Active Learning Framework for Annotating Graph Data with Application to Software Source Code Performance Prediction

Most machine learning and data analytics applications, including performance engineering in software systems, require a large number of annotations and labelled data, which might not be available in advance. Acquiring annotations often…

Software Engineering · Computer Science 2023-09-21 Peter Samoaa, Linus Aronsson, Antonio Longa, Philipp Leitner, Morteza Haghir Chehreghani

Labeling Where Adapting Fails: Cross-Domain Semantic Segmentation with Point Supervision via Active Selection

Training models dedicated to semantic segmentation requires a large amount of pixel-wise annotated data. Due to their costly nature, these annotations might not be available for the task at hand. To alleviate this problem, unsupervised…

Computer Vision and Pattern Recognition · Computer Science 2022-06-07 Fei Pan, Francois Rameau, Junsik Kim, In So Kweon

Boosting Gesture Recognition with an Automatic Gesture Annotation Framework

Training a real-time gesture recognition model heavily relies on annotated data. However, manual data annotation is costly and demands substantial human effort. In order to address this challenge, we propose a framework that can…

Computer Vision and Pattern Recognition · Computer Science 2024-10-08 Junxiao Shen, Xuhai Xu, Ran Tan, Amy Karlson, Evan Strasnick

Semi-Supervised Graph Imbalanced Regression

Data imbalance is easily found in annotated data when the observations of certain continuous label values are difficult to collect for regression tasks. When they come to molecule and polymer property predictions, the annotated graph…

Machine Learning · Computer Science 2023-05-23 Gang Liu, Tong Zhao, Eric Inae, Tengfei Luo, Meng Jiang