Related papers: On Using Active Learning and Self-Training when Mi…

Practical Obstacles to Deploying Active Learning

Active learning (AL) is a widely-used training strategy for maximizing predictive performance subject to a fixed annotation budget. In AL one iteratively selects training examples for annotation, often those for which the current model is…

Machine Learning · Computer Science 2019-11-05 David Lowell, Zachary C. Lipton, Byron C. Wallace

Meta-Learning Transferable Active Learning Policies by Deep Reinforcement Learning

Active learning (AL) aims to enable training high performance classifiers with low annotation cost by predicting which subset of unlabelled instances would be most beneficial to label. The importance of AL has motivated extensive research,…

Machine Learning · Computer Science 2018-06-14 Kunkun Pang, Mingzhi Dong, Yang Wu, Timothy Hospedales

Active Learning for Abstractive Text Summarization

Construction of human-curated annotated datasets for abstractive text summarization (ATS) is very time-consuming and expensive because creating each instance requires a human annotator to read a long document and compose a shorter summary…

Computation and Language · Computer Science 2023-01-10 Akim Tsvigun, Ivan Lysenko, Danila Sedashov, Ivan Lazichny, Eldar Damirov, Vladimir Karlov, Artemy Belousov, Leonid Sanochkin, Maxim Panov, Alexander Panchenko, Mikhail Burtsev, Artem Shelmanov

Assisted Text Annotation Using Active Learning to Achieve High Quality with Little Effort

Large amounts of annotated data have become more important than ever, especially since the rise of deep learning techniques. However, manual annotations are costly. We propose a tool that enables researchers to create large, high-quality,…

Digital Libraries · Computer Science 2021-12-23 Franziska Weeber, Felix Hamborg, Karsten Donnay, Bela Gipp

Testing the Assumptions of Active Learning for Translation Tasks with Few Samples

Active learning (AL) is a training paradigm for selecting unlabeled samples for annotation to improve model performance on a test set, which is useful when only a limited number of samples can be annotated. These algorithms often work by…

Computation and Language · Computer Science 2026-04-13 Lorenzo Jaime Yu Flores, Cesare Spinoso di-Piano, Ori Ernst, David Ifeoluwa Adelani, Jackie Chi Kit Cheung

Multi-task Active Learning for Pre-trained Transformer-based Models

Multi-task learning, in which several tasks are jointly learned by a single model, allows NLP models to share information from multiple annotations and may facilitate better predictions when the tasks are inter-related. This technique,…

Computation and Language · Computer Science 2022-10-31 Guy Rotman, Roi Reichart

Active$^2$ Learning: Actively reducing redundancies in Active Learning methods for Sequence Tagging and Machine Translation

While deep learning is a powerful tool for natural language processing (NLP) problems, successful solutions to these problems rely heavily on large amounts of annotated samples. However, manually annotating data is expensive and…

Computation and Language · Computer Science 2021-04-06 Rishi Hazra, Parag Dutta, Shubham Gupta, Mohammed Abdul Qaathir, Ambedkar Dukkipati

Towards Computationally Feasible Deep Active Learning

Active learning (AL) is a prominent technique for reducing the annotation effort required for training machine learning models. Deep learning offers a solution for several essential obstacles to deploying AL in practice but introduces many…

Computation and Language · Computer Science 2022-05-10 Akim Tsvigun, Artem Shelmanov, Gleb Kuzmin, Leonid Sanochkin, Daniil Larionov, Gleb Gusev, Manvel Avetisian, Leonid Zhukov

Active Learning for NLP with Large Language Models

Human annotation of training samples is expensive, laborious, and sometimes challenging, especially for Natural Language Processing (NLP) tasks. To reduce the labeling cost and enhance the sample efficiency, Active Learning (AL) technique…

Computation and Language · Computer Science 2024-01-17 Xuesong Wang

Applying LLMs to Active Learning: Towards Cost-Efficient Cross-Task Text Classification without Manually Labeled Data

Machine learning-based classifiers have been used for text classification, such as sentiment analysis, news classification, and toxic comment classification. However, supervised machine learning models often require large amounts of labeled…

Computation and Language · Computer Science 2025-05-06 Yejian Zhang, Shingo Takada

Active$^2$ Learning: Actively reducing redundancies in Active Learning methods for Sequence Tagging and Machine Translation

While deep learning is a powerful tool for natural language processing (NLP) problems, successful solutions to these problems rely heavily on large amounts of annotated samples. However, manually annotating data is expensive and…

Machine Learning · Computer Science 2021-04-08 Rishi Hazra, Parag Dutta, Shubham Gupta, Mohammed Abdul Qaathir, Ambedkar Dukkipati

ALE: A Simulation-Based Active Learning Evaluation Framework for the Parameter-Driven Comparison of Query Strategies for NLP

Supervised machine learning and deep learning require a large amount of labeled data, which data scientists obtain in a manual, and time-consuming annotation process. To mitigate this challenge, Active Learning (AL) proposes promising data…

Computation and Language · Computer Science 2023-08-08 Philipp Kohl, Nils Freyer, Yoka Krämer, Henri Werth, Steffen Wolf, Bodo Kraft, Matthias Meinecke, Albert Zündorf

Active learning for medical code assignment

Machine Learning (ML) is widely used to automatically extract meaningful information from Electronic Health Records (EHR) to support operational, clinical, and financial decision-making. However, ML models require a large number of…

Machine Learning · Computer Science 2021-04-14 Martha Dais Ferreira, Michal Malyska, Nicola Sahar, Riccardo Miotto, Fernando Paulovich, Evangelos Milios

Active Learning for Natural Language Generation

The field of Natural Language Generation (NLG) suffers from a severe shortage of labeled data due to the extremely expensive and time-consuming process involved in manual annotation. A natural approach for coping with this problem is active…

Computation and Language · Computer Science 2023-10-18 Yotam Perlitz, Ariel Gera, Michal Shmueli-Scheuer, Dafna Sheinwald, Noam Slonim, Liat Ein-Dor

Survey of Active Learning Hyperparameters: Insights from a Large-Scale Experimental Grid

Annotating data is a time-consuming and costly task, but it is inherently required for supervised machine learning. Active Learning (AL) is an established method that minimizes human labeling effort by iteratively selecting the most…

Machine Learning · Computer Science 2025-06-05 Julius Gonsior, Tim Rieß, Anja Reusch, Claudio Hartmann, Maik Thiele, Wolfgang Lehner

When does Active Learning Work?

Active Learning (AL) methods seek to improve classifier performance when labels are expensive or scarce. We consider two central questions: Where does AL work? How much does it help? To address these questions, a comprehensive experimental…

Machine Learning · Statistics 2014-08-07 Lewis Evans, Niall M. Adams, Christoforos Anagnostopoulos

Instance-wise Supervision-level Optimization in Active Learning

Active learning (AL) is a label-efficient machine learning paradigm that focuses on selectively annotating high-value instances to maximize learning efficiency. Its effectiveness can be further enhanced by incorporating weak supervision,…

Computer Vision and Pattern Recognition · Computer Science 2025-03-11 Shinnosuke Matsuo, Riku Togashi, Ryoma Bise, Seiichi Uchida, Masahiro Nomura

On the Limitations of Simulating Active Learning

Active learning (AL) is a human-and-model-in-the-loop paradigm that iteratively selects informative unlabeled data for human annotation, aiming to improve over random sampling. However, performing AL experiments with human annotations…

Machine Learning · Computer Science 2023-05-24 Katerina Margatina, Nikolaos Aletras

An Active Learning Approach for Jointly Estimating Worker Performance and Annotation Reliability with Crowdsourced Data

Crowdsourcing platforms offer a practical solution to the problem of affordably annotating large datasets for training supervised classifiers. Unfortunately, poor worker performance frequently threatens to compromise annotation reliability,…

Machine Learning · Computer Science 2014-01-17 Liyue Zhao, Yu Zhang, Gita Sukthankar

Reducing Label Effort: Self-Supervised meets Active Learning

Active learning is a paradigm aimed at reducing the annotation effort by training the model on actively selected informative and/or representative samples. Another paradigm to reduce the annotation effort is self-training that learns from a…

Computer Vision and Pattern Recognition · Computer Science 2021-08-27 Javad Zolfaghari Bengar, Joost van de Weijer, Bartlomiej Twardowski, Bogdan Raducanu