Related papers: Active Code Learning: Benchmarking Sample-Efficien…

Robust Active Learning: Sample-Efficient Training of Robust Deep Learning Models

Active learning is an established technique to reduce the labeling cost to build high-quality machine learning models. A core component of active learning is the acquisition function that determines which data should be selected to…

Machine Learning · Computer Science 2021-12-07 Yuejun Guo, Qiang Hu, Maxime Cordy, Mike Papadakis, Yves Le Traon

Active Learning Methods for Efficient Data Utilization and Model Performance Enhancement

In the era of data-driven intelligence, the paradox of data abundance and annotation scarcity has emerged as a critical bottleneck in the advancement of machine learning. This paper gives a detailed overview of Active Learning (AL), which…

Machine Learning · Computer Science 2025-11-27 Chiung-Yi Tseng, Junhao Song, Ziqian Bi, Tianyang Wang, Chia Xin Liang, Xinyuan Song, Ming Liu

Practical Obstacles to Deploying Active Learning

Active learning (AL) is a widely-used training strategy for maximizing predictive performance subject to a fixed annotation budget. In AL one iteratively selects training examples for annotation, often those for which the current model is…

Machine Learning · Computer Science 2019-11-05 David Lowell, Zachary C. Lipton, Byron C. Wallace

Active Learning: Problem Settings and Recent Developments

In supervised learning, acquiring labeled training data for a predictive model can be very costly, but acquiring a large amount of unlabeled data is often quite easy. Active learning is a method of obtaining predictive models with high…

Machine Learning · Computer Science 2020-12-17 Hideitsu Hino

Should Code Models Learn Pedagogically? A Preliminary Evaluation of Curriculum Learning for Real-World Software Engineering Tasks

Learning-based techniques, especially advanced pre-trained models for code have demonstrated capabilities in code understanding and generation, solving diverse software engineering (SE) tasks. Despite the promising results, current training…

Software Engineering · Computer Science 2025-02-07 Kyi Shin Khant, Hong Yi Lin, Patanamon Thongtanunam

A Benchmark for Active Learning of Variability-Intensive Systems

Behavioral models are the key enablers for behavioral analysis of Software Product Lines (SPL), including testing and model checking. Active model learning comes to the rescue when family behavioral models are non-existent or outdated. A…

Software Engineering · Computer Science 2022-03-11 Shaghayegh Tavassoli, Carlos Diego Nascimento Damasceno, Mohammad Reza Mousavi, Ramtin Khosravi

Active Learning Using Aggregated Acquisition Functions: Accuracy and Sustainability Analysis

Active learning (AL) is a machine learning (ML) approach that strategically selects the most informative samples for annotation during training, aiming to minimize annotation costs. This strategy not only reduces labeling expenses but also…

Machine Learning · Computer Science 2026-03-25 Cédric Jung, Shirin Salehi, Anke Schmeink

The Good, the Bad, and the Missing: Neural Code Generation for Machine Learning Tasks

Machine learning (ML) has been increasingly used in a variety of domains, while solving ML programming tasks poses unique challenges because of the fundamentally different nature and construction from general programming tasks, especially…

Software Engineering · Computer Science 2024-01-17 Jiho Shin, Moshi Wei, Junjie Wang, Lin Shi, Song Wang

Active Testing: Sample-Efficient Model Evaluation

We introduce a new framework for sample-efficient model evaluation that we call active testing. While approaches like active learning reduce the number of labels needed for model training, existing literature largely ignores the cost of…

Machine Learning · Statistics 2021-06-15 Jannik Kossen, Sebastian Farquhar, Yarin Gal, Tom Rainforth

Active learning to optimise time-expensive algorithm selection

Hard optimisation problems such as Boolean Satisfiability typically have long solving times and can usually be solved by many algorithms, although the performance can vary widely in practice. Research has shown that no single algorithm…

Machine Learning · Computer Science 2019-09-10 Riccardo Volpato, Guangyan Song

Limitations of Assessing Active Learning Performance at Runtime

Classification algorithms aim to predict an unknown label (e.g., a quality class) for a new instance (e.g., a product). Therefore, training samples (instances and labels) are used to deduct classification hypotheses. Often, it is relatively…

Machine Learning · Computer Science 2019-01-30 Daniel Kottke, Jim Schellinger, Denis Huseljic, Bernhard Sick

Exploring Adversarial Examples for Efficient Active Learning in Machine Learning Classifiers

Machine learning researchers have long noticed the phenomenon that the model training process will be more effective and efficient when the training samples are densely sampled around the underlying decision boundary. While this observation…

Machine Learning · Computer Science 2021-09-24 Honggang Yu, Shihfeng Zeng, Teng Zhang, Ing-Chao Lin, Yier Jin

Optimizing Active Learning for Low Annotation Budgets

When we cannot assume a large amount of annotated data, active learning is a good strategy. It consists in learning a model on a small amount of annotated data (annotation budget) and in choosing the best set of points to annotate in…

Computer Vision and Pattern Recognition · Computer Science 2022-01-19 Umang Aggarwal, Adrian Popescu, Céline Hudelot

Exploring the Design Space of Cognitive Engagement Techniques with AI-Generated Code for Enhanced Learning

Novice programmers are increasingly relying on Large Language Models (LLMs) to generate code for learning programming concepts. However, this interaction can lead to superficial engagement, giving learners an illusion of learning and…

Human-Computer Interaction · Computer Science 2024-10-14 Majeed Kazemitabaar, Oliver Huang, Sangho Suh, Austin Z. Henley, Tovi Grossman

Efficient Code LLM Training via Distribution-Consistent and Diversity-Aware Data Selection

Recent advancements in large language models (LLMs) have significantly improved code generation and program comprehension, accelerating the evolution of software engineering. Current methods primarily enhance model performance by leveraging…

Computation and Language · Computer Science 2025-07-04 Weijie Lyu, Sheng-Jun Huang, Xuan Xia

Active Feature Acquisition with Supervised Matrix Completion

Feature missing is a serious problem in many applications, which may lead to low quality of training data and further significantly degrade the learning performance. While feature acquisition usually involves special devices or complex…

Machine Learning · Computer Science 2018-06-06 Sheng-Jun Huang, Miao Xu, Ming-Kun Xie, Masashi Sugiyama, Gang Niu, Songcan Chen

An Empirical Study of Retrieval-Augmented Code Generation: Challenges and Opportunities

Code generation aims to automatically generate code snippets of specific programming language according to natural language descriptions. The continuous advancements in deep learning, particularly pre-trained models, have empowered the code…

Software Engineering · Computer Science 2025-01-24 Zezhou Yang, Sirong Chen, Cuiyun Gao, Zhenhao Li, Xing Hu, Kui Liu, Xin Xia

On the Evaluation Criterions for the Active Learning Processes

In many data mining applications collection of sufficiently large datasets is the most time consuming and expensive. On the other hand, industrial methods of data collection create huge databases, and make difficult direct applications of…

Machine Learning · Statistics 2011-08-03 Vladimir Nikulin

Hitting the Target: Stopping Active Learning at the Cost-Based Optimum

Active learning allows machine learning models to be trained using fewer labels while retaining similar performance to traditional supervised learning. An active learner selects the most informative data points, requests their labels, and…

Machine Learning · Computer Science 2023-11-22 Zac Pullar-Strecker, Katharina Dost, Eibe Frank, Jörg Wicker

Constraining the Parameters of High-Dimensional Models with Active Learning

Constraining the parameters of physical models with $>5-10$ parameters is a widespread problem in fields like particle physics and astronomy. The generation of data to explore this parameter space often requires large amounts of…

Machine Learning · Computer Science 2019-11-26 Sascha Caron, Tom Heskes, Sydney Otten, Bob Stienen