List-Decodable Linear Regression
Abstract
We give the first polynomial-time algorithm for robust regression in the list-decodable setting, where an adversary can corrupt more than a $1/2$ fraction of the examples. For any $\alpha < 1$, our algorithm takes as input a sample $\{(x_i, y_i)\}_{i \leq n}$ of $n$ linear equations where $\alpha n$ of the equations satisfy $y_i = \langle x_i, \ell^* \rangle + \zeta$ for some small noise $\zeta$ and $(1-\alpha)n$ of the equations are {\em arbitrarily} chosen. It outputs a list $L$ of size $O(1/\alpha)$, a fixed constant, that contains an $\ell$ that is close to $\ell^*$. Our algorithm succeeds whenever the inliers are chosen from a \emph{certifiably} anti-concentrated distribution $D$. In particular, this gives a $(d/\alpha)^{O(1/\alpha^8)}$ time algorithm to find an $O(1/\alpha)$ size list when the inlier distribution is standard Gaussian. For discrete product distributions that are anti-concentrated only in \emph{regular} directions, we give an algorithm that achieves a similar guarantee under the promise that $\ell^*$ has all coordinates of the same magnitude. To complement our result, we prove that the anti-concentration assumption on the inliers is information-theoretically necessary. Our algorithm is based on a new framework for list-decodable learning that strengthens the `identifiability to algorithms' paradigm based on the sum-of-squares method. In independent and concurrent work, Raghavendra and Yau also used the sum-of-squares method to give a similar result for list-decodable regression.
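To make the input model concrete, the following is a minimal Python sketch of a list-decodable regression instance. It is not the paper's algorithm. It only illustrates why a list of candidates is needed when inliers are a minority: with an inlier fraction `alpha`, an adversary can plant roughly `1/alpha - 1` decoy regressors that each fit as many equations as the true one. The function names, the block layout of the samples, the noiseless setting (zero noise), and the particular adversary are all assumptions made for illustration.

```python
import random


def dot(u, v):
    """Inner product of two equal-length vectors."""
    return sum(a * b for a, b in zip(u, v))


def make_instance(n, d, alpha, seed=0):
    """Build n equations (x, y): round(alpha * n) inliers, the rest adversarial.

    Assumption for illustration: the adversary spends its budget planting
    round(1 / alpha) - 1 decoy regressors, each fitting its own block of points.
    """
    rng = random.Random(seed)
    block = round(alpha * n)   # number of inliers
    k = round(1 / alpha)       # candidate regressors, true one included
    regressors = [[rng.gauss(0, 1) for _ in range(d)] for _ in range(k)]
    samples = []
    for i in range(n):
        x = [rng.gauss(0, 1) for _ in range(d)]          # Gaussian covariates
        owner = min(i // block, k - 1)                   # block 0 holds the inliers
        samples.append((x, dot(x, regressors[owner])))   # noiseless equation
    return samples, regressors


def fit_fraction(samples, ell, tol=1e-9):
    """Fraction of equations that ell satisfies up to tolerance tol."""
    return sum(abs(y - dot(x, ell)) <= tol for x, y in samples) / len(samples)


samples, regressors = make_instance(n=400, d=5, alpha=0.25)
fractions = [fit_fraction(samples, ell) for ell in regressors]
```

In this toy instance `regressors[0]` plays the role of the true regressor, yet every entry of `regressors` explains about a quarter of the equations, so no procedure can single out the true one from the data alone. Returning a short list that contains a vector close to it is the best achievable guarantee.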
Cite
@article{arxiv.1905.05679,
title = {List-Decodable Linear Regression},
author = {Sushrut Karmalkar and Adam R. Klivans and Pravesh K. Kothari},
journal = {arXiv preprint arXiv:1905.05679},
year = {2019}
}
Comments
28 pages