Testable Learning with Distribution Shift
Abstract
We revisit the fundamental problem of learning with distribution shift, in which a learner is given labeled samples from training distribution $\mathcal{D}$, unlabeled samples from test distribution $\mathcal{D}'$ and is asked to output a classifier with low test error. The standard approach in this setting is to bound the loss of a classifier in terms of some notion of distance between $\mathcal{D}$ and $\mathcal{D}'$. These distances, however, seem difficult to compute and do not lead to efficient algorithms. We depart from this paradigm and define a new model called testable learning with distribution shift, where we can obtain provably efficient algorithms for certifying the performance of a classifier on a test distribution. In this model, a learner outputs a classifier with low test error whenever samples from $\mathcal{D}$ and $\mathcal{D}'$ pass an associated test; moreover, the test must accept if the marginal of $\mathcal{D}$ equals the marginal of $\mathcal{D}'$. We give several positive results for learning well-studied concept classes such as halfspaces, intersections of halfspaces, and decision trees when the marginal of $\mathcal{D}$ is Gaussian or uniform on $\{\pm 1\}^d$. Prior to our work, no efficient algorithms for these basic cases were known without strong assumptions on $\mathcal{D}'$. For halfspaces in the realizable case (where there exists a halfspace consistent with both $\mathcal{D}$ and $\mathcal{D}'$), we combine a moment-matching approach with ideas from active learning to simulate an efficient oracle for estimating disagreement regions. To extend to the non-realizable setting, we apply recent work from testable (agnostic) learning. More generally, we prove that any function class with low-degree $\mathcal{L}_2$-sandwiching polynomial approximators can be learned in our model. We apply constructions from the pseudorandomness literature to obtain the required approximators.
Cite
@article{arxiv.2311.15142,
title = {Testable Learning with Distribution Shift},
author = {Adam R. Klivans and Konstantinos Stavropoulos and Arsen Vasilyan},
journal = {arXiv preprint arXiv:2311.15142},
year = {2024}
}
Comments
To appear in The 37th Annual Conference on Learning Theory (COLT 2024)