Predictive Inference with Weak Supervision
Abstract
The expense of acquiring labels in large-scale statistical machine learning makes partially and weakly labeled data attractive, though it is not always apparent how to leverage such data for model fitting or validation. We present a methodology to bridge the gap between partial supervision and validation, developing a conformal prediction framework to provide valid predictive confidence sets -- sets that cover a true label with a prescribed probability, independent of the underlying distribution -- using weakly labeled data. To do so, we introduce a (necessary) new notion of coverage and predictive validity, then develop several application scenarios, providing efficient algorithms for classification and several large-scale structured prediction problems. We corroborate the hypothesis that the new coverage definition allows for tighter and more informative (but valid) confidence sets through several experiments.
Cite
@article{arxiv.2201.08315,
title = {Predictive Inference with Weak Supervision},
author = {Maxime Cauchois and Suyash Gupta and Alnur Ali and John Duchi},
journal = {arXiv preprint arXiv:2201.08315},
year = {2022}
}