Related papers: Improving Probabilistic Models in Text Classificat…

Applying LLMs to Active Learning: Towards Cost-Efficient Cross-Task Text Classification without Manually Labeled Data

Machine learning-based classifiers have been used for text classification, such as sentiment analysis, news classification, and toxic comment classification. However, supervised machine learning models often require large amounts of labeled…

Computation and Language · Computer Science · 2025-05-06 · Yejian Zhang, Shingo Takada

Active Learning from Positive and Unlabeled Data

During recent years, active learning has evolved into a popular paradigm for utilizing user's feedback to improve accuracy of learning algorithms. Active learning works by selecting the most informative sample among unlabeled data and…

Machine Learning · Computer Science · 2016-11-17 · Alireza Ghasemi, Hamid R. Rabiee, Mohsen Fadaee, Mohammad T. Manzuri, Mohammad H. Rohban

Self-Training for Sample-Efficient Active Learning for Text Classification with Pre-Trained Language Models

Active learning is an iterative labeling process that is used to obtain a small labeled subset, despite the absence of labeled data, thereby enabling to train a model for supervised tasks such as text classification. While active learning…

Computation and Language · Computer Science · 2024-10-07 · Christopher Schröder, Gerhard Heyer

The Application of Active Query K-Means in Text Classification

Active learning is a state-of-art machine learning approach to deal with an abundance of unlabeled data. In the field of Natural Language Processing, typically it is costly and time-consuming to have all the data annotated. This…

Computation and Language · Computer Science · 2021-07-19 · Yukun Jiang

Active Learning Using Uncertainty Information

Many active learning methods belong to the retraining-based approaches, which select one unlabeled instance, add it to the training set with its possible labels, retrain the classification model, and evaluate the criteria that we base our…

Machine Learning · Statistics · 2017-03-01 · Yazhou Yang, Marco Loog

A cost-reducing partial labeling estimator in text classification problem

We propose a new approach to address the text classification problems when learning with partial labels is beneficial. Instead of offering each training sample a set of candidate labels, we assign negative-oriented labels to the ambiguous…

Machine Learning · Statistics · 2019-06-11 · Jiangning Chen, Zhibo Dai, Juntao Duan, Qianli Hu, Ruilin Li, Heinrich Matzinger, Ionel Popescu, Haoyan Zhai

A Simple yet Brisk and Efficient Active Learning Platform for Text Classification

In this work, we propose the use of a fully managed machine learning service, which utilizes active learning to directly build models from unstructured data. With this tool, business users can quickly and easily build machine learning…

Machine Learning · Computer Science · 2021-02-02 · Teja Kanchinadam, Qian You, Keith Westpfahl, James Kim, Siva Gunda, Sebastian Seith, Glenn Fung

Frugal Reinforcement-based Active Learning

Most of the existing learning models, particularly deep neural networks, are reliant on large datasets whose hand-labeling is expensive and time demanding. A current trend is to make the learning of these models frugal and less dependent on…

Computer Vision and Pattern Recognition · Computer Science · 2022-12-12 · Sebastien Deschamps, Hichem Sahbi

Limitations of Assessing Active Learning Performance at Runtime

Classification algorithms aim to predict an unknown label (e.g., a quality class) for a new instance (e.g., a product). Therefore, training samples (instances and labels) are used to deduct classification hypotheses. Often, it is relatively…

Machine Learning · Computer Science · 2019-01-30 · Daniel Kottke, Jim Schellinger, Denis Huseljic, Bernhard Sick

Active Testing: Sample-Efficient Model Evaluation

We introduce a new framework for sample-efficient model evaluation that we call active testing. While approaches like active learning reduce the number of labels needed for model training, existing literature largely ignores the cost of…

Machine Learning · Statistics · 2021-06-15 · Jannik Kossen, Sebastian Farquhar, Yarin Gal, Tom Rainforth

Active learning for reducing labeling effort in text classification tasks

Labeling data can be an expensive task as it is usually performed manually by domain experts. This is cumbersome for deep learning, as it is dependent on large labeled datasets. Active learning (AL) is a paradigm that aims to reduce…

Computation and Language · Computer Science · 2021-11-05 · Pieter Floris Jacobs, Gideon Maillette de Buy Wenniger, Marco Wiering, Lambert Schomaker

Beyond Labels: Information-Efficient Human-in-the-Loop Learning using Ranking and Selection Queries

Integrating human expertise into machine learning systems often reduces the role of experts to labeling oracles, a paradigm that limits the amount of information exchanged and fails to capture the nuances of human judgment. We address this…

Human-Computer Interaction · Computer Science · 2026-02-18 · Belén Martín-Urcelay, Yoonsang Lee, Matthieu R. Bloch, Christopher J. Rozell

Empirical Evaluations of Active Learning Strategies in Legal Document Review

One type of machine learning, text classification, is now regularly applied in the legal matters involving voluminous document populations because it can reduce the time and expense associated with the review of those documents. One form of…

Information Retrieval · Computer Science · 2019-04-04 · Rishi Chhatwal, Nathaniel Huber-Fliflet, Robert Keeling, Jianping Zhang, Haozhen Zhao

Active Robust Learning

In many practical applications of learning algorithms, unlabeled data is cheap and abundant whereas labeled data is expensive. Active learning algorithms developed to achieve better performance with lower cost. Usually Representativeness…

Machine Learning · Computer Science · 2016-08-26 · Hossein Ghafarian, Hadi Sadoghi Yazdi

On the Fragility of Active Learners for Text Classification

Active learning (AL) techniques optimally utilize a labeling budget by iteratively selecting instances that are most valuable for learning. However, they lack "prerequisite checks", i.e., there are no prescribed criteria to pick an AL…

Machine Learning · Computer Science · 2024-10-07 · Abhishek Ghose, Emma Thuong Nguyen

Cost-Accuracy Aware Adaptive Labeling for Active Learning

Conventional active learning algorithms assume a single labeler that produces noiseless label at a given, fixed cost, and aim to achieve the best generalization performance for given classifier under a budget constraint. However, in many…

Machine Learning · Computer Science · 2021-05-25 · Ruijiang Gao, Maytal Saar-tsechansky

Text Classification Using Label Names Only: A Language Model Self-Training Approach

Current text classification methods typically require a good number of human-labeled documents as training data, which can be costly and difficult to obtain in real applications. Humans can perform classification without seeing any labeled…

Computation and Language · Computer Science · 2020-10-15 · Yu Meng, Yunyi Zhang, Jiaxin Huang, Chenyan Xiong, Heng Ji, Chao Zhang, Jiawei Han

Cold-start Active Learning through Self-supervised Language Modeling

Active learning strives to reduce annotation costs by choosing the most critical examples to label. Typically, the active learning strategy is contingent on the classification model. For instance, uncertainty sampling depends on poorly…

Computation and Language · Computer Science · 2020-10-26 · Michelle Yuan, Hsuan-Tien Lin, Jordan Boyd-Graber

Compute-Efficient Active Learning

Active learning, a powerful paradigm in machine learning, aims at reducing labeling costs by selecting the most informative samples from an unlabeled dataset. However, the traditional active learning process often demands extensive…

Machine Learning · Computer Science · 2024-01-17 · Gábor Németh, Tamás Matuszka

Advancing Deep Active Learning & Data Subset Selection: Unifying Principles with Information-Theory Intuitions

At its core, this thesis aims to enhance the practicality of deep learning by improving the label and training efficiency of deep learning models. To this end, we investigate data subset selection techniques, specifically active learning…

Machine Learning · Computer Science · 2024-03-11 · Andreas Kirsch