Related papers: A Benchmark Generative Probabilistic Model for Wea…

Fidelity-Weighted Learning

Training deep neural networks requires many training samples, but in practice training labels are expensive to obtain and may be of varying quality, as some may be from trusted expert labelers while others might be from heuristics or other…

Machine Learning · Computer Science 2018-05-24 Mostafa Dehghani, Arash Mehrjou, Stephan Gouws, Jaap Kamps, Bernhard Schölkopf

Unlock the Power of Unlabeled Data in Language Driving Model

Recent Vision-based Large Language Models (VisionLLMs) for autonomous driving have seen rapid advancements. However, such promotion is extremely dependent on large-scale high-quality annotated data, which is costly and labor-intensive. To…

Computer Vision and Pattern Recognition · Computer Science 2025-03-18 Chaoqun Wang, Jie Yang, Xiaobin Hong, Ruimao Zhang

Data Consistency for Weakly Supervised Learning

In many applications, training machine learning models involves using large amounts of human-annotated data. Obtaining precise labels for the data is expensive. Instead, training with weak supervision provides a low-cost alternative. We…

Machine Learning · Computer Science 2022-02-09 Chidubem Arachie, Bert Huang

Semi-supervised Deep Kernel Learning: Regression with Unlabeled Data by Minimizing Predictive Variance

Large amounts of labeled data are typically required to train deep learning models. For many real-world problems, however, acquiring additional data can be expensive or even impossible. We present semi-supervised deep kernel learning…

Machine Learning · Computer Science 2019-03-05 Neal Jean, Sang Michael Xie, Stefano Ermon

Variational Sequential Labelers for Semi-Supervised Learning

We introduce a family of multitask variational methods for semi-supervised sequence labeling. Our model family consists of a latent-variable generative model and a discriminative labeler. The generative models use latent variables to define…

Computation and Language · Computer Science 2019-06-25 Mingda Chen, Qingming Tang, Karen Livescu, Kevin Gimpel

End-to-End Weak Supervision

Aggregating multiple sources of weak supervision (WS) can ease the data-labeling bottleneck prevalent in many machine learning applications, by replacing the tedious manual collection of ground truth labels. Current state of the art…

Machine Learning · Computer Science 2021-12-01 Salva Rühling Cachay, Benedikt Boecking, Artur Dubrawski

Pseudo Strong Labels from Frame-Level Predictions for Weakly Supervised Sound Event Detection

Weakly Supervised Sound Event Detection (WSSED), which relies on audio tags without precise onset and offset times, has become prevalent due to the scarcity of strongly labeled data that includes exact temporal boundaries for events. This…

Audio and Speech Processing · Electrical Eng. & Systems 2025-01-08 Yuliang Zhang, Defeng Huang, Roberto Togneri

Jointly Improving Language Understanding and Generation with Quality-Weighted Weak Supervision of Automatic Labeling

Neural natural language generation (NLG) and understanding (NLU) models are data-hungry and require massive amounts of annotated data to be competitive. Recent frameworks address this bottleneck with generative models that synthesize weak…

Computation and Language · Computer Science 2021-02-09 Ernie Chang, Vera Demberg, Alex Marin

Sparse Conditional Hidden Markov Model for Weakly Supervised Named Entity Recognition

Weakly supervised named entity recognition methods train label models to aggregate the token annotations of multiple noisy labeling functions (LFs) without seeing any manually annotated labels. To work well, the label model needs to…

Computation and Language · Computer Science 2022-06-08 Yinghao Li, Le Song, Chao Zhang

Self-supervised Speaker Recognition with Loss-gated Learning

In self-supervised learning for speaker recognition, pseudo labels are useful as the supervision signals. It is a known fact that a speaker recognition model doesn't always benefit from pseudo labels due to their unreliability. In this…

Audio and Speech Processing · Electrical Eng. & Systems 2022-07-15 Ruijie Tao, Kong Aik Lee, Rohan Kumar Das, Ville Hautamäki, Haizhou Li

Weakly Supervised Instance Segmentation by Learning Annotation Consistent Instances

Recent approaches for weakly supervised instance segmentations depend on two components: (i) a pseudo label generation model that provides instances which are consistent with a given annotation; and (ii) an instance segmentation model,…

Computer Vision and Pattern Recognition · Computer Science 2020-07-21 Aditya Arun, C. V. Jawahar, M. Pawan Kumar

Weaker Than You Think: A Critical Look at Weakly Supervised Learning

Weakly supervised learning is a popular approach for training machine learning models in low-resource settings. Instead of requesting high-quality yet costly human annotations, it allows training models with noisy annotations obtained from…

Computation and Language · Computer Science 2023-09-19 Dawei Zhu, Xiaoyu Shen, Marius Mosbach, Andreas Stephan, Dietrich Klakow

Active Self-Semi-Supervised Learning for Few Labeled Samples

Training deep models with limited annotations poses a significant challenge when applied to diverse practical domains. Employing semi-supervised learning alongside the self-supervised model offers the potential to enhance label efficiency.…

Computer Vision and Pattern Recognition · Computer Science 2024-11-05 Ziting Wen, Oscar Pizarro, Stefan Williams

Large Margin Semi-supervised Structured Output Learning

In structured output learning, obtaining labelled data for real-world applications is usually costly, while unlabelled examples are available in abundance. Semi-supervised structured classification has been developed to handle large amounts…

Machine Learning · Computer Science 2013-11-12 P. Balamurugan, Shirish Shevade, Sundararajan Sellamanickam

Prediction-Constrained Training for Semi-Supervised Mixture and Topic Models

Supervisory signals have the potential to make low-dimensional data representations, like those learned by mixture and topic models, more interpretable and useful. We propose a framework for training latent variable models that explicitly…

Machine Learning · Statistics 2017-11-15 Michael C. Hughes, Leah Weiner, Gabriel Hope, Thomas H. McCoy, Roy H. Perlis, Erik B. Sudderth, Finale Doshi-Velez

Pseudo-Label Noise Suppression Techniques for Semi-Supervised Semantic Segmentation

Semi-supervised learning (SSL) can reduce the need for large labelled datasets by incorporating unlabelled data into the training. This is particularly interesting for semantic segmentation, where labelling data is very costly and…

Computer Vision and Pattern Recognition · Computer Science 2022-10-20 Sebastian Scherer, Robin Schön, Rainer Lienhart

ProPML: Probability Partial Multi-label Learning

Partial Multi-label Learning (PML) is a type of weakly supervised learning where each training instance corresponds to a set of candidate labels, among which only some are true. In this paper, we introduce ProPML, a novel probabilistic…

Machine Learning · Computer Science 2024-03-13 Łukasz Struski, Adam Pardyl, Jacek Tabor, Bartosz Zieliński

Weakly Supervised Segmentation as Semantic-Based Regularization

Weakly supervised semantic segmentation (WSSS) trains dense pixel-level segmentation models from partial or coarse annotations such as bounding boxes, scribbles, or image-level tags. While recent work leverages foundation models such as the…

Computer Vision and Pattern Recognition · Computer Science 2026-05-14 Stefano Colamonaco, Andrei-Bogdan Florea, Jaron Maene

Low Resource Pipeline for Spoken Language Understanding via Weak Supervision

In Weak Supervised Learning (WSL), a model is trained over noisy labels obtained from semantic rules and task-specific pre-trained models. Rules offer limited generalization over tasks and require significant manual efforts while…

Computation and Language · Computer Science 2022-06-22 Ayush Kumar, Rishabh Kumar Tripathi, Jithendra Vepa

Reliable Weakly Supervised Learning: Maximize Gain and Maintain Safeness

Weakly supervised data are widespread and have attracted much attention. However, since label quality is often difficult to guarantee, sometimes the use of weakly supervised data will lead to unsatisfactory performance, i.e., performance…

Machine Learning · Computer Science 2019-04-23 Lan-Zhe Guo, Yu-Feng Li, Ming Li, Jin-Feng Yi, Bo-Wen Zhou, Zhi-Hua Zhou