Related papers: ScriptoriumWS: A Code Generation Assistant for Wea…

AutoWS: Automated Weak Supervision Framework for Text Classification

Creating large, good quality labeled data has become one of the major bottlenecks for developing machine learning applications. Multiple techniques have been developed to either decrease the dependence on labeled data (zero/few-shot…

Computation and Language · Computer Science 2023-02-08 Abhinav Bohra, Huy Nguyen, Devashish Khatwani

Universalizing Weak Supervision

Weak supervision (WS) frameworks are a popular way to bypass hand-labeling large datasets for training data-hungry models. These approaches synthesize multiple noisy but cheaply-acquired estimates of labels into a set of high-quality…

Machine Learning · Computer Science 2023-11-30 Changho Shin, Winfred Li, Harit Vishwakarma, Nicholas Roberts, Frederic Sala

Lifting Weak Supervision To Structured Prediction

Weak supervision (WS) is a rich set of techniques that produce pseudolabels by aggregating easily obtained but potentially noisy label estimates from a variety of sources. WS is theoretically well understood for binary classification, where…

Machine Learning · Computer Science 2022-11-28 Harit Vishwakarma, Nicholas Roberts, Frederic Sala

Creating Training Sets via Weak Indirect Supervision

Creating labeled training sets has become one of the major roadblocks in machine learning. To address this, recent Weak Supervision (WS) frameworks synthesize training labels from multiple potentially noisy supervision sources.…

Machine Learning · Computer Science 2022-03-16 Jieyu Zhang, Bohan Wang, Xiangchen Song, Yujing Wang, Yaming Yang, Jing Bai, Alexander Ratner

Nemo: Guiding and Contextualizing Weak Supervision for Interactive Data Programming

Weak Supervision (WS) techniques allow users to efficiently create large training datasets by programmatically labeling data with heuristic sources of supervision. While the success of WS relies heavily on the provided labeling heuristics,…

Machine Learning · Computer Science 2022-10-25 Cheng-Yu Hsieh, Jieyu Zhang, Alexander Ratner

A Survey on Programmatic Weak Supervision

Labeling training data has become one of the major roadblocks to using machine learning. Among various weak supervision paradigms, programmatic weak supervision (PWS) has achieved remarkable success in easing the manual labeling bottleneck…

Machine Learning · Computer Science 2022-02-15 Jieyu Zhang, Cheng-Yu Hsieh, Yue Yu, Chao Zhang, Alexander Ratner

Weakly Supervised Scene Text Generation for Low-resource Languages

A large number of annotated training images is crucial for training successful scene text recognition models. However, collecting sufficient datasets can be a labor-intensive and costly process, particularly for low-resource languages. To…

Computer Vision and Pattern Recognition · Computer Science 2023-06-28 Yangchen Xie, Xinyuan Chen, Hongjian Zhan, Palaiahankote Shivakum, Bing Yin, Cong Liu, Yue Lu

Detecting Fake News with Weak Social Supervision

Limited labeled data is becoming the largest bottleneck for supervised learning systems. This is especially the case for many real-world tasks where large scale annotated examples are either too expensive to acquire or unavailable due to…

Social and Information Networks · Computer Science 2020-05-28 Kai Shu, Ahmed Hassan Awadallah, Susan Dumais, Huan Liu

AutoWS-Bench-101: Benchmarking Automated Weak Supervision with 100 Labels

Weak supervision (WS) is a powerful method to build labeled datasets for training supervised models in the face of little-to-no labeled data. It replaces hand-labeling data with aggregating multiple noisy-but-cheap label estimates expressed…

Machine Learning · Computer Science 2023-11-28 Nicholas Roberts, Xintong Li, Tzu-Heng Huang, Dyah Adila, Spencer Schoenberg, Cheng-Yu Liu, Lauren Pick, Haotian Ma, Aws Albarghouthi, Frederic Sala

Interactive Weak Supervision: Learning Useful Heuristics for Data Labeling

Obtaining large annotated datasets is critical for training successful machine learning models and it is often a bottleneck in practice. Weak supervision offers a promising alternative for producing labeled datasets without ground truth…

Machine Learning · Computer Science 2021-01-27 Benedikt Boecking, Willie Neiswanger, Eric Xing, Artur Dubrawski

Stronger Than You Think: Benchmarking Weak Supervision on Realistic Tasks

Weak supervision (WS) is a popular approach for label-efficient learning, leveraging diverse sources of noisy but inexpensive weak labels to automatically annotate training data. Despite its wide usage, WS and its practical value are…

Machine Learning · Computer Science 2025-01-31 Tianyi Zhang, Linrong Cai, Jeffrey Li, Nicholas Roberts, Neel Guha, Jinoh Lee, Frederic Sala

Data Consistency for Weakly Supervised Learning

In many applications, training machine learning models involves using large amounts of human-annotated data. Obtaining precise labels for the data is expensive. Instead, training with weak supervision provides a low-cost alternative. We…

Machine Learning · Computer Science 2022-02-09 Chidubem Arachie, Bert Huang

Weak Supervision with Incremental Source Accuracy Estimation

Motivated by the desire to generate labels for real-time data we develop a method to estimate the dependency structure and accuracy of weak supervision sources incrementally. Our method first estimates the dependency structure associated…

Machine Learning · Computer Science 2022-05-12 Richard Gresham Correro

Weakly-Supervised Neural Text Classification

Deep neural networks are gaining increasing popularity for the classic text classification task, due to their strong expressive power and less requirement for feature engineering. Despite such attractiveness, neural text classification…

Information Retrieval · Computer Science 2018-09-13 Yu Meng, Jiaming Shen, Chao Zhang, Jiawei Han

Losses over Labels: Weakly Supervised Learning via Direct Loss Construction

Owing to the prohibitive costs of generating large amounts of labeled data, programmatic weak supervision is a growing paradigm within machine learning. In this setting, users design heuristics that provide noisy labels for subsets of the…

Machine Learning · Computer Science 2023-10-06 Dylan Sam, J. Zico Kolter

Generative Modeling Helps Weak Supervision (and Vice Versa)

Many promising applications of supervised machine learning face hurdles in the acquisition of labeled data in sufficient quantity and quality, creating an expensive bottleneck. To overcome such limitations, techniques that do not depend on…

Machine Learning · Computer Science 2023-03-14 Benedikt Boecking, Nicholas Roberts, Willie Neiswanger, Stefano Ermon, Frederic Sala, Artur Dubrawski

Learning to Selectively Learn for Weakly-supervised Paraphrase Generation

Paraphrase generation is a longstanding NLP task that has diverse applications for downstream NLP tasks. However, the effectiveness of existing efforts predominantly relies on large amounts of golden labeled data. Though unsupervised…

Computation and Language · Computer Science 2021-09-28 Kaize Ding, Dingcheng Li, Alexander Hanbo Li, Xing Fan, Chenlei Guo, Yang Liu, Huan Liu

Simpler Does It: Generating Semantic Labels with Objectness Guidance

Existing weakly or semi-supervised semantic segmentation methods utilize image or box-level supervision to generate pseudo-labels for weakly labeled images. However, due to the lack of strong supervision, the generated pseudo-labels are…

Computer Vision and Pattern Recognition · Computer Science 2021-10-22 Md Amirul Islam, Matthew Kowal, Sen Jia, Konstantinos G. Derpanis, Neil D. B. Bruce

Reliable Weakly Supervised Learning: Maximize Gain and Maintain Safeness

Weakly supervised data are widespread and have attracted much attention. However, since label quality is often difficult to guarantee, sometimes the use of weakly supervised data will lead to unsatisfactory performance, i.e., performance…

Machine Learning · Computer Science 2019-04-23 Lan-Zhe Guo, Yu-Feng Li, Ming Li, Jin-Feng Yi, Bo-Wen Zhou, Zhi-Hua Zhou

Self-Training with Weak Supervision

State-of-the-art deep neural networks require large-scale labeled training data that is often expensive to obtain or not available for many tasks. Weak supervision in the form of domain-specific rules has been shown to be useful in such…

Computation and Language · Computer Science 2021-04-13 Giannis Karamanolakis, Subhabrata Mukherjee, Guoqing Zheng, Ahmed Hassan Awadallah