Predictive Inference with Weak Supervision

Machine Learning, 2022-02-10, v2

Abstract

The expense of acquiring labels in large-scale statistical machine learning makes partially and weakly labeled data attractive, though it is not always apparent how to leverage such data for model fitting or validation. We present a methodology to bridge the gap between partial supervision and validation, developing a conformal prediction framework to provide valid predictive confidence sets -- sets that cover a true label with a prescribed probability, independent of the underlying distribution -- using weakly labeled data. To do so, we introduce a (necessary) new notion of coverage and predictive validity, then develop several application scenarios, providing efficient algorithms for classification and several large-scale structured prediction problems. Several experiments corroborate the hypothesis that the new coverage definition allows for tighter and more informative (but still valid) confidence sets.
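
For readers unfamiliar with conformal prediction, the following is a minimal sketch of the standard split conformal procedure for classification with fully labeled calibration data. It is not the weak-supervision method of the paper, whose coverage notion is not spelled out in this abstract; it only illustrates what "sets that cover a true label with a prescribed probability" means in the usual setting. The function names, the score `1 - p[y]`, and the toy numbers are assumptions chosen for illustration.

```python
import math

def conformal_threshold(cal_probs, cal_labels, alpha):
    """Calibrate a score threshold on held-out, fully labeled data.

    cal_probs: list of per-example class-probability lists (hypothetical model output)
    cal_labels: list of true class indices
    alpha: target miscoverage level, so the target coverage is 1 - alpha
    """
    # Nonconformity score: one minus the probability given to the true label.
    scores = sorted(1.0 - p[y] for p, y in zip(cal_probs, cal_labels))
    n = len(scores)
    # Finite-sample corrected rank: the ceil((n + 1)(1 - alpha))-th smallest score.
    k = math.ceil((n + 1) * (1 - alpha))
    # With too few calibration points, only the trivial (full) set is valid.
    return scores[k - 1] if k <= n else math.inf

def prediction_set(probs, threshold):
    """Return every label whose score does not exceed the threshold."""
    return [y for y, p in enumerate(probs) if 1.0 - p <= threshold]

# Toy calibration data (made up for illustration only).
cal_probs = [[0.9, 0.1], [0.6, 0.4], [0.2, 0.8]]
cal_labels = [0, 0, 1]
t = conformal_threshold(cal_probs, cal_labels, alpha=0.5)
confident_set = prediction_set([0.85, 0.15], t)
```

Under exchangeability of calibration and test points, sets built this way contain the true label with probability at least `1 - alpha`, regardless of the data distribution or the quality of the model; a poor model simply yields larger sets.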

Cite

@article{arxiv.2201.08315,
  title  = {Predictive Inference with Weak Supervision},
  author = {Maxime Cauchois and Suyash Gupta and Alnur Ali and John Duchi},
  journal = {arXiv preprint arXiv:2201.08315},
  year   = {2022}
}