Related papers: Learning Hyper Label Model for Programmatic Weak S…

Weakly Supervised Label Learning Flows

Supervised learning usually requires a large amount of labelled data. However, attaining ground-truth labels is costly for many tasks. Alternatively, weakly supervised methods learn with cheap weak signals that only approximately label some…

Machine Learning · Computer Science 2024-11-26 You Lu, Wenzhuo Song, Chidubem Arachie, Bert Huang

A Survey on Programmatic Weak Supervision

Labeling training data has become one of the major roadblocks to using machine learning. Among various weak supervision paradigms, programmatic weak supervision (PWS) has achieved remarkable success in easing the manual labeling bottleneck…

Machine Learning · Computer Science 2022-02-15 Jieyu Zhang, Cheng-Yu Hsieh, Yue Yu, Chao Zhang, Alexander Ratner

Universalizing Weak Supervision

Weak supervision (WS) frameworks are a popular way to bypass hand-labeling large datasets for training data-hungry models. These approaches synthesize multiple noisy but cheaply-acquired estimates of labels into a set of high-quality…

Machine Learning · Computer Science 2023-11-30 Changho Shin, Winfred Li, Harit Vishwakarma, Nicholas Roberts, Frederic Sala

Weak Supervision Performance Evaluation via Partial Identification

Programmatic Weak Supervision (PWS) enables supervised model training without direct access to ground truth labels, utilizing weak labels from heuristics, crowdsourcing, or pre-trained models. However, the absence of ground truth…

Machine Learning · Statistics 2024-11-01 Felipe Maia Polo, Subha Maity, Mikhail Yurochkin, Moulinath Banerjee, Yuekai Sun

End-to-End Weak Supervision

Aggregating multiple sources of weak supervision (WS) can ease the data-labeling bottleneck prevalent in many machine learning applications, by replacing the tedious manual collection of ground truth labels. Current state of the art…

Machine Learning · Computer Science 2021-12-01 Salva Rühling Cachay, Benedikt Boecking, Artur Dubrawski

A Benchmark Generative Probabilistic Model for Weak Supervised Learning

Finding relevant and high-quality datasets to train machine learning models is a major bottleneck for practitioners. Furthermore, to address ambitious real-world use-cases there is usually the requirement that the data come labelled with…

Machine Learning · Computer Science 2023-10-05 Georgios Papadopoulos, Fran Silavong, Sean Moran

Learning from Multiple Noisy Partial Labelers

Programmatic weak supervision creates models without hand-labeled training data by combining the outputs of heuristic labelers. Existing frameworks make the restrictive assumption that labelers output a single class label. Enabling users to…

Machine Learning · Computer Science 2022-03-28 Peilin Yu, Tiffany Ding, Stephen H. Bach

Label Augmentation with Reinforced Labeling for Weak Supervision

Weak supervision (WS) is an alternative to the traditional supervised learning to address the need for ground truth. Data programming is a practical WS approach that allows programmatic labeling data samples using labeling functions (LFs)…

Machine Learning · Computer Science 2022-04-14 Gürkan Solmaz, Flavio Cirillo, Fabio Maresca, Anagha Gode Anil Kumar

Refining Labeling Functions with Limited Labeled Data

Programmatic weak supervision (PWS) significantly reduces human effort for labeling data by combining the outputs of user-provided labeling functions (LFs) on unlabeled datapoints. However, the quality of the generated labels depends…

Machine Learning · Computer Science 2025-06-05 Chenjie Li, Amir Gilad, Boris Glavic, Zhengjie Miao, Sudeepa Roy

AutoWS: Automated Weak Supervision Framework for Text Classification

Creating large, good quality labeled data has become one of the major bottlenecks for developing machine learning applications. Multiple techniques have been developed to either decrease the dependence of labeled data (zero/few-shot…

Computation and Language · Computer Science 2023-02-08 Abhinav Bohra, Huy Nguyen, Devashish Khatwani

AutoWS-Bench-101: Benchmarking Automated Weak Supervision with 100 Labels

Weak supervision (WS) is a powerful method to build labeled datasets for training supervised models in the face of little-to-no labeled data. It replaces hand-labeling data with aggregating multiple noisy-but-cheap label estimates expressed…

Machine Learning · Computer Science 2023-11-28 Nicholas Roberts, Xintong Li, Tzu-Heng Huang, Dyah Adila, Spencer Schoenberg, Cheng-Yu Liu, Lauren Pick, Haotian Ma, Aws Albarghouthi, Frederic Sala

Fusing Conditional Submodular GAN and Programmatic Weak Supervision

Programmatic Weak Supervision (PWS) and generative models serve as crucial tools that enable researchers to maximize the utility of existing datasets without resorting to laborious data gathering and manual annotation processes. PWS uses…

Computer Vision and Pattern Recognition · Computer Science 2023-12-19 Kumar Shubham, Pranav Sastry, Prathosh AP

Losses over Labels: Weakly Supervised Learning via Direct Loss Construction

Owing to the prohibitive costs of generating large amounts of labeled data, programmatic weak supervision is a growing paradigm within machine learning. In this setting, users design heuristics that provide noisy labels for subsets of the…

Machine Learning · Computer Science 2023-10-06 Dylan Sam, J. Zico Kolter

Data Consistency for Weakly Supervised Learning

In many applications, training machine learning models involves using large amounts of human-annotated data. Obtaining precise labels for the data is expensive. Instead, training with weak supervision provides a low-cost alternative. We…

Machine Learning · Computer Science 2022-02-09 Chidubem Arachie, Bert Huang

Creating Training Sets via Weak Indirect Supervision

Creating labeled training sets has become one of the major roadblocks in machine learning. To address this, recent Weak Supervision (WS) frameworks synthesize training labels from multiple potentially noisy supervision sources.…

Machine Learning · Computer Science 2022-03-16 Jieyu Zhang, Bohan Wang, Xiangchen Song, Yujing Wang, Yaming Yang, Jing Bai, Alexander Ratner

Semi-Supervised Data Programming with Subset Selection

The paradigm of data programming, which uses weak supervision in the form of rules/labelling functions, and semi-supervised learning, which augments small amounts of labelled data with a large unlabelled dataset, have shown great promise in…

Machine Learning · Computer Science 2021-06-15 Ayush Maheshwari, Oishik Chatterjee, KrishnaTeja Killamsetty, Ganesh Ramakrishnan, Rishabh Iyer

Learning to Robustly Aggregate Labeling Functions for Semi-supervised Data Programming

A critical bottleneck in supervised machine learning is the need for large amounts of labeled data which is expensive and time consuming to obtain. However, it has been shown that a small amount of labeled data, while insufficient to…

Machine Learning · Computer Science 2022-03-11 Ayush Maheshwari, Krishnateja Killamsetty, Ganesh Ramakrishnan, Rishabh Iyer, Marina Danilevsky, Lucian Popa

SepLL: Separating Latent Class Labels from Weak Supervision Noise

In the weakly supervised learning paradigm, labeling functions automatically assign heuristic, often noisy, labels to data samples. In this work, we provide a method for learning from weak labels by separating two types of complementary…

Machine Learning · Computer Science 2022-10-26 Andreas Stephan, Vasiliki Kougia, Benjamin Roth

Reliable Programmatic Weak Supervision with Confidence Intervals for Label Probabilities

The accurate labeling of datasets is often both costly and time-consuming. Given an unlabeled dataset, programmatic weak supervision obtains probabilistic predictions for the labels by leveraging multiple weak labeling functions (LFs) that…

Machine Learning · Statistics 2025-08-07 Verónica Álvarez, Santiago Mazuelas, Steven An, Sanjoy Dasgupta

Lifting Weak Supervision To Structured Prediction

Weak supervision (WS) is a rich set of techniques that produce pseudolabels by aggregating easily obtained but potentially noisy label estimates from a variety of sources. WS is theoretically well understood for binary classification, where…

Machine Learning · Computer Science 2022-11-28 Harit Vishwakarma, Nicholas Roberts, Frederic Sala