Related papers: A Practical Incremental Learning Framework For Spa…

Batch Active Learning from the Perspective of Sparse Approximation

Active learning enables efficient model training by leveraging interactions between machine learning agents and human annotators. We study and propose a novel framework that formulates batch active learning from the sparse approximation's…

Machine Learning · Computer Science 2022-11-08 Maohao Shen, Bowen Jiang, Jacky Yibo Zhang, Oluwasanmi Koyejo

Assisted Text Annotation Using Active Learning to Achieve High Quality with Little Effort

Large amounts of annotated data have become more important than ever, especially since the rise of deep learning techniques. However, manual annotations are costly. We propose a tool that enables researchers to create large, high-quality,…

Digital Libraries · Computer Science 2021-12-23 Franziska Weeber, Felix Hamborg, Karsten Donnay, Bela Gipp

ALE: A Simulation-Based Active Learning Evaluation Framework for the Parameter-Driven Comparison of Query Strategies for NLP

Supervised machine learning and deep learning require a large amount of labeled data, which data scientists obtain in a manual and time-consuming annotation process. To mitigate this challenge, Active Learning (AL) proposes promising data…

Computation and Language · Computer Science 2023-08-08 Philipp Kohl, Nils Freyer, Yoka Krämer, Henri Werth, Steffen Wolf, Bodo Kraft, Matthias Meinecke, Albert Zündorf

Adaptive Latent Entity Expansion for Document Retrieval

Despite considerable progress in neural relevance ranking techniques, search engines still struggle to process complex queries effectively - both in terms of precision and recall. Sparse and dense Pseudo-Relevance Feedback (PRF) approaches…

Information Retrieval · Computer Science 2023-12-06 Iain Mackie, Shubham Chatterjee, Sean MacAvaney, Jeffrey Dalton

ESA: Annotation-Efficient Active Learning for Semantic Segmentation

Active learning enhances annotation efficiency by selecting the most revealing samples for labeling, thereby reducing reliance on extensive human input. Previous methods in semantic segmentation have centered on individual pixels or small…

Computer Vision and Pattern Recognition · Computer Science 2025-08-07 Jinchao Ge, Zeyu Zhang, Minh Hieu Phan, Bowen Zhang, Akide Liu, Yang Zhao, Shuwen Zhao

Improving Named Entity Recognition in Telephone Conversations via Effective Active Learning with Human in the Loop

Telephone transcription data can be very noisy due to speech recognition errors, disfluencies, etc. Not only that annotating such data is very challenging for the annotators, but also such data may have lots of annotation errors even after…

Computation and Language · Computer Science 2022-11-03 Md Tahmid Rahman Laskar, Cheng Chen, Xue-Yong Fu, Shashi Bhushan TN

Active Learning for Abstractive Text Summarization

Construction of human-curated annotated datasets for abstractive text summarization (ATS) is very time-consuming and expensive because creating each instance requires a human annotator to read a long document and compose a shorter summary…

Computation and Language · Computer Science 2023-01-10 Akim Tsvigun, Ivan Lysenko, Danila Sedashov, Ivan Lazichny, Eldar Damirov, Vladimir Karlov, Artemy Belousov, Leonid Sanochkin, Maxim Panov, Alexander Panchenko, Mikhail Burtsev, Artem Shelmanov

Early Stage Sparse Retrieval with Entity Linking

Despite the advantages of their low-resource settings, traditional sparse retrievers depend on exact matching approaches between high-dimensional bag-of-words (BoW) representations of both the queries and the collection. As a result,…

Information Retrieval · Computer Science 2022-08-11 Dahlia Shehata, Negar Arabzadeh, Charles L. A. Clarke

A Comprehensive Benchmark Framework for Active Learning Methods in Entity Matching

Entity Matching (EM) is a core data cleaning task, aiming to identify different mentions of the same real-world entity. Active learning is one way to address the challenge of scarce labeled data in practice, by dynamically collecting the…

Databases · Computer Science 2020-03-31 Venkata Vamsikrishna Meduri, Lucian Popa, Prithviraj Sen, Mohamed Sarwat

Revisiting Sparse Retrieval for Few-shot Entity Linking

Entity linking aims to link ambiguous mentions to their corresponding entities in a knowledge base. One of the key challenges comes from insufficient labeled data for specific domains. Although dense retrievers have achieved excellent…

Computation and Language · Computer Science 2023-10-20 Yulin Chen, Zhenran Xu, Baotian Hu, Min Zhang

Active Learning with a Noisy Annotator

Active Learning (AL) aims to reduce annotation costs by strategically selecting the most informative samples for labeling. However, most active learning methods struggle in the low-budget regime where only a few labeled examples are…

Machine Learning · Computer Science 2025-04-08 Netta Shafir, Guy Hacohen, Daphna Weinshall

Data-efficient Active Learning for Structured Prediction with Partial Annotation and Self-Training

In this work we propose a pragmatic method that reduces the annotation cost for structured label spaces using active learning. Our approach leverages partial annotation, which reduces labeling costs for structured outputs by selecting only…

Computation and Language · Computer Science 2023-10-20 Zhisong Zhang, Emma Strubell, Eduard Hovy

Optimizing Active Learning for Low Annotation Budgets

When we cannot assume a large amount of annotated data, active learning is a good strategy. It consists in learning a model on a small amount of annotated data (annotation budget) and in choosing the best set of points to annotate in…

Computer Vision and Pattern Recognition · Computer Science 2022-01-19 Umang Aggarwal, Adrian Popescu, Céline Hudelot

SparseEval: Efficient Evaluation of Large Language Models by Sparse Optimization

As large language models (LLMs) continue to scale up, their performance on various downstream tasks has significantly improved. However, evaluating their capabilities has become increasingly expensive, as performing inference on a large…

Computation and Language · Computer Science 2026-02-10 Taolin Zhang, Hang Guo, Wang Lu, Tao Dai, Shu-Tao Xia, Jindong Wang

Enhancing Cost Efficiency in Active Learning with Candidate Set Query

This paper introduces a cost-efficient active learning (AL) framework for classification, featuring a novel query design called candidate set query. Unlike traditional AL queries requiring the oracle to examine all possible classes, our…

Machine Learning · Computer Science 2025-08-20 Yeho Gwon, Sehyun Hwang, Hoyoung Kim, Jungseul Ok, Suha Kwak

CHASe: Client Heterogeneity-Aware Data Selection for Effective Federated Active Learning

Active learning (AL) reduces human annotation costs for machine learning systems by strategically selecting the most informative unlabeled data for annotation, but performing it individually may still be insufficient due to restricted data…

Machine Learning · Computer Science 2025-04-25 Jun Zhang, Jue Wang, Huan Li, Zhongle Xie, Ke Chen, Lidan Shou

Learning a Cost-Effective Annotation Policy for Question Answering

State-of-the-art question answering (QA) relies upon large amounts of training data for which labeling is time consuming and thus expensive. For this reason, customizing QA systems is challenging. As a remedy, we propose a novel framework…

Computation and Language · Computer Science 2020-11-10 Bernhard Kratzwald, Stefan Feuerriegel, Huan Sun

Information Retrieval with Entity Linking

Despite the advantages of their low-resource settings, traditional sparse retrievers depend on exact matching approaches between high-dimensional bag-of-words (BoW) representations of both the queries and the collection. As a result,…

Information Retrieval · Computer Science 2024-04-16 Dahlia Shehata

Efficient Deep Representation Learning by Adaptive Latent Space Sampling

Supervised deep learning requires a large amount of training samples with annotations (e.g. label class for classification task, pixel- or voxel-wised label map for segmentation tasks), which are expensive and time-consuming to obtain.…

Computer Vision and Pattern Recognition · Computer Science 2020-04-14 Yuanhan Mo, Shuo Wang, Chengliang Dai, Rui Zhou, Zhongzhao Teng, Wenjia Bai, Yike Guo

Focusing on Potential Named Entities During Active Label Acquisition

Named entity recognition (NER) aims to identify mentions of named entities in an unstructured text and classify them into predefined named entity classes. While deep learning-based pre-trained language models help to achieve good predictive…

Computation and Language · Computer Science 2023-06-16 Ali Osman Berk Sapci, Oznur Tastan, Reyyan Yeniterzi