Related papers: Learning from Indirect Observations
High-quality labels are often very scarce, whereas unlabeled data with inferred weak labels occurs more naturally. In many cases, these weak labels dictate the frequency of each respective class over a set of instances. In this paper, we…
In many applications, training machine learning models involves using large amounts of human-annotated data. Obtaining precise labels for the data is expensive. Instead, training with weak supervision provides a low-cost alternative. We…
Curation of large fully supervised datasets has become one of the major roadblocks for machine learning. Weak supervision provides an alternative to supervised learning by training with cheap, noisy, and possibly correlated labeling…
The scarcity of data annotated at the desired level of granularity is a recurring issue in many applications. Significant amounts of effort have been devoted to developing weakly supervised methods tailored to each individual setting, which…
Neural network approaches have recently shown to be effective in several information retrieval (IR) tasks. However, neural approaches often require large volumes of training data to perform effectively, which is not always available. To…
Weakly supervised data are widespread and have attracted much attention. However, since label quality is often difficult to guarantee, sometimes the use of weakly supervised data will lead to unsatisfactory performance, i.e., performance…
In this paper, we propose a method for training neural networks when we have a large set of data with weak labels and a small amount of data with true labels. In our proposed model, we train two neural networks: a target network, the…
Contrastive learning has shown outstanding performances in both supervised and unsupervised learning, and has recently been introduced to solve weakly supervised learning problems such as semi-supervised learning and noisy label learning.…
Existing weak supervision approaches use all the data covered by weak signals to train a classifier. We show both theoretically and empirically that this is not always optimal. Intuitively, there is a tradeoff between the amount of…
Supervised learning usually requires a large amount of labelled data. However, attaining ground-truth labels is costly for many tasks. Alternatively, weakly supervised methods learn with cheap weak signals that only approximately label some…
Weakly-supervised object detection attempts to limit the amount of supervision by dispensing the need for bounding boxes, but still assumes image-level labels on the entire training set. In this work, we study the problem of training an…
Obtaining accurate class labels is often costly or unreliable, and may also be limited by privacy or other practical conditions. Compared with asking an annotator to provide the exact class, it is often easier to ask whether the true label…
We consider the task of training classifiers without labels. We propose a weakly supervised method---adversarial label learning---that trains classifiers to perform well against an adversary that chooses labels for training data. The weak…
Weakly supervised learning is a popular approach for training machine learning models in low-resource settings. Instead of requesting high-quality yet costly human annotations, it allows training models with noisy annotations obtained from…
Motivated by the desire to generate labels for real-time data we develop a method to estimate the dependency structure and accuracy of weak supervision sources incrementally. Our method first estimates the dependency structure associated…
Consider a classification problem where we do not have access to labels for individual training examples, but only have average labels over subpopulations. We give practical examples of this setup and show how such a classification task can…
We introduce an adaptive method with formal quality guarantees for weak supervision in a non-stationary setting. Our goal is to infer the unknown labels of a sequence of data by using weak supervision sources that provide independent noisy…
We introduce Integrated Weak Learning, a principled framework that integrates weak supervision into the training process of machine learning models. Our approach jointly trains the end-model and a label model that aggregates multiple…
Training deep neural networks requires massive amounts of training data, but for many tasks only limited labeled data is available. This makes weak supervision attractive, using weak or noisy signals like the output of heuristic methods or…
We study generalization properties of weakly supervised learning. That is, learning where only a few "strong" labels (the actual target of our prediction) are present but many more "weak" labels are available. In particular, we show that…