Related papers: REAL: A Representative Error-Driven Approach for A…

Towards Computationally Feasible Deep Active Learning

Active learning (AL) is a prominent technique for reducing the annotation effort required for training machine learning models. Deep learning offers a solution for several essential obstacles to deploying AL in practice but introduces many…

Computation and Language · Computer Science 2022-05-10 Akim Tsvigun, Artem Shelmanov, Gleb Kuzmin, Leonid Sanochkin, Daniil Larionov, Gleb Gusev, Manvel Avetisian, Leonid Zhukov

Navigating the Pitfalls of Active Learning Evaluation: A Systematic Framework for Meaningful Performance Assessment

Active Learning (AL) aims to reduce the labeling burden by interactively selecting the most informative samples from a pool of unlabeled data. While there has been extensive research on improving AL query methods in recent years, some…

Computer Vision and Pattern Recognition · Computer Science 2023-11-06 Carsten T. Lüth, Till J. Bungert, Lukas Klein, Paul F. Jaeger

Pool-Based Sequential Active Learning for Regression

Active learning is a machine learning approach for reducing the data labeling effort. Given a pool of unlabeled samples, it tries to select the most useful ones to label so that a model built from them can achieve the best possible…

Machine Learning · Computer Science 2020-03-31 Dongrui Wu

Targeting Optimal Active Learning via Example Quality

In many classification problems unlabelled data is abundant and a subset can be chosen for labelling. This defines the context of active learning (AL), where methods systematically select that subset, to improve a classifier by retraining.…

Machine Learning · Statistics 2014-07-31 Lewis P. G. Evans, Niall M. Adams, Christoforos Anagnostopoulos

Exemplar Guided Active Learning

We consider the problem of wisely using a limited budget to label a small subset of a large unlabeled dataset. We are motivated by the NLP problem of word sense disambiguation. For any word, we have a set of candidate labels from a…

Machine Learning · Computer Science 2020-11-04 Jason Hartford, Kevin Leyton-Brown, Hadas Raviv, Dan Padnos, Shahar Lev, Barak Lenz

Identifying Wrongly Predicted Samples: A Method for Active Learning

State-of-the-art machine learning models require access to significant amount of annotated data in order to achieve the desired level of performance. While unlabelled data can be largely available and even abundant, annotation process can…

Machine Learning · Computer Science 2020-10-15 Rahaf Aljundi, Nikolay Chumerin, Daniel Olmeda Reino

REALITrees: Rashomon Ensemble Active Learning for Interpretable Trees

Active learning reduces labeling costs by selecting samples that maximize information gain. A dominant framework, Query-by-Committee (QBC), typically relies on perturbation-based diversity by inducing model disagreement through random…

Machine Learning · Statistics 2026-03-25 Simon D. Nguyen, Hayden McTavish, Kentaro Hoffman, Cynthia Rudin, Tyler H. McCormick

Active Deep Learning on Entity Resolution by Risk Sampling

While the state-of-the-art performance on entity resolution (ER) has been achieved by deep learning, its effectiveness depends on large quantities of accurately labeled training data. To alleviate the data labeling burden, Active Learning…

Machine Learning · Computer Science 2020-12-25 Youcef Nafa, Qun Chen, Zhaoqiang Chen, Xingyu Lu, Haiyang He, Tianyi Duan, Zhanhuai Li

Deep Active Learning with Contrastive Learning Under Realistic Data Pool Assumptions

Active learning aims to identify the most informative data from an unlabeled data pool that enables a model to reach the desired accuracy rapidly. This benefits especially deep neural networks which generally require a huge number of…

Computer Vision and Pattern Recognition · Computer Science 2023-03-28 Jihyo Kim, Jeonghyeon Kim, Sangheum Hwang

Unsupervised Pool-Based Active Learning for Linear Regression

In many real-world machine learning applications, unlabeled data can be easily obtained, but it is very time-consuming and/or expensive to label them. So, it is desirable to be able to select the optimal samples to label, so that a good…

Machine Learning · Computer Science 2020-01-16 Ziang Liu, Dongrui Wu

On the Limitations of Simulating Active Learning

Active learning (AL) is a human-and-model-in-the-loop paradigm that iteratively selects informative unlabeled data for human annotation, aiming to improve over random sampling. However, performing AL experiments with human annotations…

Machine Learning · Computer Science 2023-05-24 Katerina Margatina, Nikolaos Aletras

CEREAL: Few-Sample Clustering Evaluation

Evaluating clustering quality with reliable evaluation metrics like normalized mutual information (NMI) requires labeled data that can be expensive to annotate. We focus on the underexplored problem of estimating clustering quality with…

Machine Learning · Computer Science 2022-10-04 Nihal V. Nayak, Ethan R. Elenberg, Clemens Rosenbaum

Pool-Based Unsupervised Active Learning for Regression Using Iterative Representativeness-Diversity Maximization (iRDM)

Active learning (AL) selects the most beneficial unlabeled samples to label, and hence a better machine learning model can be trained from the same number of labeled samples. Most existing active learning for regression (ALR) approaches are…

Machine Learning · Computer Science 2022-11-15 Ziang Liu, Xue Jiang, Hanbin Luo, Weili Fang, Jiajing Liu, Dongrui Wu

FAL-CUR: Fair Active Learning using Uncertainty and Representativeness on Fair Clustering

Active Learning (AL) techniques have proven to be highly effective in reducing data labeling costs across a range of machine learning tasks. Nevertheless, one known challenge of these methods is their potential to introduce unfairness…

Machine Learning · Computer Science 2023-12-20 Ricky Fajri, Akrati Saxena, Yulong Pei, Mykola Pechenizkiy

Optimal Labeler Assignment and Sampling for Active Learning in the Presence of Imperfect Labels

Active Learning (AL) has garnered significant interest across various application domains where labeling training data is costly. AL provides a framework that helps practitioners query informative samples for annotation by oracles…

Machine Learning · Computer Science 2025-12-16 Pouya Ahadi, Blair Winograd, Camille Zaug, Karunesh Arora, Lijun Wang, Kamran Paynabar

DEAL: Deep Evidential Active Learning for Image Classification

Convolutional Neural Networks (CNNs) have proven to be state-of-the-art models for supervised computer vision tasks, such as image classification. However, large labeled data sets are generally needed for the training and validation of such…

Machine Learning · Computer Science 2020-10-28 Patrick Hemmer, Niklas Kühl, Jakob Schöffer

An Efficient Active Learning Pipeline for Legal Text Classification

Active Learning (AL) is a powerful tool for learning with less labeled data, in particular, for specialized domains, like legal documents, where unlabeled data is abundant, but the annotation requires domain expertise and is thus expensive.…

Computation and Language · Computer Science 2022-11-16 Sepideh Mamooler, Rémi Lebret, Stéphane Massonnet, Karl Aberer

Reducing Confusion in Active Learning for Part-Of-Speech Tagging

Active learning (AL) uses a data selection algorithm to select useful training samples to minimize annotation cost. This is now an essential tool for building low-resource syntactic analyzers such as part-of-speech (POS) taggers. Existing…

Computation and Language · Computer Science 2020-11-24 Aditi Chaudhary, Antonios Anastasopoulos, Zaid Sheikh, Graham Neubig

Integrating Informativeness, Representativeness and Diversity in Pool-Based Sequential Active Learning for Regression

In many real-world machine learning applications, unlabeled samples are easy to obtain, but it is expensive and/or time-consuming to label them. Active learning is a common approach for reducing this data labeling effort. It optimally…

Machine Learning · Computer Science 2022-11-15 Ziang Liu, Dongrui Wu

Active Discriminative Text Representation Learning

We propose a new active learning (AL) method for text classification with convolutional neural networks (CNNs). In AL, one selects the instances to be manually labeled with the aim of maximizing model performance with minimal effort. Neural…

Computation and Language · Computer Science 2016-12-02 Ye Zhang, Matthew Lease, Byron C. Wallace