Related papers: Learning Fast Matching Models from Weak Annotation…

Pre-Trained Vision-Language Models as Partial Annotators

Pre-trained vision-language models learn massive data to model unified representations of images and natural languages, which can be widely applied to downstream machine learning tasks. In addition to zero-shot inference, in order to better…

Computer Vision and Pattern Recognition · Computer Science · 2024-06-28 · Qian-Wei Wang, Yuqiu Xie, Letian Zhang, Zimo Liu, Shu-Tao Xia

LEAN-LIFE: A Label-Efficient Annotation Framework Towards Learning from Explanation

Successfully training a deep neural network demands a huge corpus of labeled data. However, each label only provides limited information to learn from and collecting the requisite number of labels involves massive human effort. In this…

Computation and Language · Computer Science · 2020-04-17 · Dong-Ho Lee, Rahul Khanna, Bill Yuchen Lin, Jamin Chen, Seyeon Lee, Qinyuan Ye, Elizabeth Boschee, Leonardo Neves, Xiang Ren

Robust Assignment of Labels for Active Learning with Sparse and Noisy Annotations

Supervised classification algorithms are used to solve a growing number of real-life problems around the globe. Their performance is strictly connected with the quality of labels used in training. Unfortunately, acquiring good-quality…

Machine Learning · Computer Science · 2024-07-08 · Daniel Kałuża, Andrzej Janusz, Dominik Ślęzak

Learning Matching Models with Weak Supervision for Response Selection in Retrieval-based Chatbots

We propose a method that can leverage unlabeled data to learn a matching model for response selection in retrieval-based chatbots. The method employs a sequence-to-sequence architecture (Seq2Seq) model as a weak annotator to judge the…

Computation and Language · Computer Science · 2018-05-11 · Yu Wu, Wei Wu, Zhoujun Li, Ming Zhou

Reducing Label Effort: Self-Supervised meets Active Learning

Active learning is a paradigm aimed at reducing the annotation effort by training the model on actively selected informative and/or representative samples. Another paradigm to reduce the annotation effort is self-training that learns from a…

Computer Vision and Pattern Recognition · Computer Science · 2021-08-27 · Javad Zolfaghari Bengar, Joost van de Weijer, Bartlomiej Twardowski, Bogdan Raducanu

Perceptual Quality-based Model Training under Annotator Label Uncertainty

Annotators exhibit disagreement during data labeling, which can be termed as annotator label uncertainty. Annotator label uncertainty manifests in variations of labeling quality. Training with a single low-quality annotation per sample…

Computer Vision and Pattern Recognition · Computer Science · 2024-03-18 · Chen Zhou, Mohit Prabhushankar, Ghassan AlRegib

A Survey on Deep Learning with Noisy Labels: How to train your model when you cannot trust on the annotations?

Noisy Labels are commonly present in data sets automatically collected from the internet, mislabeled by non-specialist annotators, or even specialists in a challenging task, such as in the medical field. Although deep learning models have…

Machine Learning · Computer Science · 2020-12-08 · Filipe R. Cordeiro, Gustavo Carneiro

Multi-utility Learning: Structured-output Learning with Multiple Annotation-specific Loss Functions

Structured-output learning is a challenging problem; particularly so because of the difficulty in obtaining large datasets of fully labelled instances for training. In this paper we try to overcome this difficulty by presenting a…

Computer Vision and Pattern Recognition · Computer Science · 2014-06-24 · Roman Shapovalov, Dmitry Vetrov, Anton Osokin, Pushmeet Kohli

NERO: A Neural Rule Grounding Framework for Label-Efficient Relation Extraction

Deep neural models for relation extraction tend to be less reliable when perfectly labeled data is limited, despite their success in label-sufficient scenarios. Instead of seeking more instance-level labels from human annotators, here we…

Computation and Language · Computer Science · 2020-01-17 · Wenxuan Zhou, Hongtao Lin, Bill Yuchen Lin, Ziqi Wang, Junyi Du, Leonardo Neves, Xiang Ren

Tuning Vision-Language Models with Candidate Labels by Prompt Alignment

Vision-language models (VLMs) can learn high-quality representations from a large-scale training dataset of image-text pairs. Prompt learning is a popular approach to fine-tuning VLM to adapt them to downstream tasks. Despite the satisfying…

Computer Vision and Pattern Recognition · Computer Science · 2024-12-31 · Zhifang Zhang, Yuwei Niu, Xin Liu, Beibei Li

Machine Learning from Explanations

Acquiring and training on large-scale labeled data can be impractical due to cost constraints. Additionally, the use of small training datasets can result in considerable variability in model outcomes, overfitting, and learning of spurious…

Machine Learning · Computer Science · 2025-07-08 · Jiashu Tao, Reza Shokri

Multi-Label Learning from Single Positive Labels

Predicting all applicable labels for a given image is known as multi-label classification. Compared to the standard multi-class case (where each image has only one label), it is considerably more challenging to annotate training data for…

Computer Vision and Pattern Recognition · Computer Science · 2021-10-25 · Elijah Cole, Oisin Mac Aodha, Titouan Lorieul, Pietro Perona, Dan Morris, Nebojsa Jojic

A Unified Active Learning Framework for Annotating Graph Data with Application to Software Source Code Performance Prediction

Most machine learning and data analytics applications, including performance engineering in software systems, require a large number of annotations and labelled data, which might not be available in advance. Acquiring annotations often…

Software Engineering · Computer Science · 2023-09-21 · Peter Samoaa, Linus Aronsson, Antonio Longa, Philipp Leitner, Morteza Haghir Chehreghani

An Adaptive Supervision Framework for Active Learning in Object Detection

Active learning approaches in computer vision generally involve querying strong labels for data. However, previous works have shown that weak supervision can be effective in training models for vision tasks while greatly reducing annotation…

Computer Vision and Pattern Recognition · Computer Science · 2019-10-16 · Sai Vikas Desai, Akshay L Chandra, Wei Guo, Seishi Ninomiya, Vineeth N Balasubramanian

Adaptive Self-training for Few-shot Neural Sequence Labeling

Sequence labeling is an important technique employed for many Natural Language Processing (NLP) tasks, such as Named Entity Recognition (NER), slot tagging for dialog systems and semantic parsing. Large-scale pre-trained language models…

Computation and Language · Computer Science · 2020-12-14 · Yaqing Wang, Subhabrata Mukherjee, Haoda Chu, Yuancheng Tu, Ming Wu, Jing Gao, Ahmed Hassan Awadallah

Towards Computationally Feasible Deep Active Learning

Active learning (AL) is a prominent technique for reducing the annotation effort required for training machine learning models. Deep learning offers a solution for several essential obstacles to deploying AL in practice but introduces many…

Computation and Language · Computer Science · 2022-05-10 · Akim Tsvigun, Artem Shelmanov, Gleb Kuzmin, Leonid Sanochkin, Daniil Larionov, Gleb Gusev, Manvel Avetisian, Leonid Zhukov

Task-Adaptive Pre-Training for Boosting Learning With Noisy Labels: A Study on Text Classification for African Languages

For high-resource languages like English, text classification is a well-studied task. The performance of modern NLP models easily achieves an accuracy of more than 90% in many standard datasets for text classification in English (Xie et…

Computation and Language · Computer Science · 2022-06-06 · Dawei Zhu, Michael A. Hedderich, Fangzhou Zhai, David Ifeoluwa Adelani, Dietrich Klakow

Generalizable Error Modeling for Human Data Annotation: Evidence From an Industry-Scale Search Data Annotation Program

Machine learning (ML) and artificial intelligence (AI) systems rely heavily on human-annotated data for training and evaluation. A major challenge in this context is the occurrence of annotation errors, as their effects can degrade model…

Machine Learning · Computer Science · 2024-09-27 · Heinrich Peters, Alireza Hashemi, James Rae

Enhanced Sample Selection with Confidence Tracking: Identifying Correctly Labeled yet Hard-to-Learn Samples in Noisy Data

We propose a novel sample selection method for image classification in the presence of noisy labels. Existing methods typically consider small-loss samples as correctly labeled. However, some correctly labeled samples are inherently…

Computer Vision and Pattern Recognition · Computer Science · 2025-04-25 · Weiran Pan, Wei Wei, Feida Zhu, Yong Deng

Substituting Data Annotation with Balanced Updates and Collective Loss in Multi-label Text Classification

Multi-label text classification (MLTC) is the task of assigning multiple labels to a given text, and has a wide range of application domains. Most existing approaches require an enormous amount of annotated data to learn a classifier and/or…

Computation and Language · Computer Science · 2023-09-26 · Muberra Ozmen, Joseph Cotnareanu, Mark Coates