Approximate Sparse Linear Regression

Computational Geometry 2018-05-01 v2

Abstract

In the Sparse Linear Regression (SLR) problem, given a $d \times n$ matrix $M$ and a $d$-dimensional query $q$, the goal is to compute a $k$-sparse $n$-dimensional vector $\tau$ such that the error $\|M\tau - q\|$ is minimized. This problem is equivalent to the following geometric problem: given a set $P$ of $n$ points and a query point $q$ in $d$ dimensions, find the closest $k$-dimensional subspace to $q$ that is spanned by a subset of $k$ points in $P$. In this paper, we present data-structures/algorithms and conditional lower bounds for several variants of this problem (such as finding the closest induced $k$-dimensional flat/simplex instead of a subspace). In particular, we present approximation algorithms for the online variants of the above problems with query time $\tilde O(n^{k-1})$, which are of interest in the "low sparsity regime" where $k$ is small, e.g., $2$ or $3$. For $k=d$, this matches, up to polylogarithmic factors, the lower bound that relies on the affinely degenerate conjecture (i.e., deciding if $n$ points in $\mathbb{R}^d$ contain $d+1$ points contained in a hyperplane takes $\Omega(n^d)$ time). Moreover, our algorithms involve formulating and solving several geometric subproblems, which we believe to be of independent interest.
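
To make the problem statement concrete, the following is a minimal sketch of the naive exact baseline, not the approximation algorithms of the paper: it enumerates every subset of $k$ points, fits $q$ by least squares on that subset, and keeps the best fit. This costs on the order of $n^k$ small least-squares solves per query. The function names (`brute_force_slr`, `solve_linear_system`) are hypothetical, and the sketch assumes that subsets whose points are linearly dependent can simply be skipped.

```python
from itertools import combinations


def dot(u, v):
    """Inner product of two equal-length vectors."""
    return sum(x * y for x, y in zip(u, v))


def solve_linear_system(a, b):
    """Solve a x = b by Gaussian elimination with partial pivoting.

    Returns None when the matrix is (numerically) singular.
    """
    n = len(b)
    # Augmented matrix [a | b], copied so the inputs are not mutated.
    aug = [list(a[i]) + [b[i]] for i in range(n)]
    for col in range(n):
        pivot = max(range(col, n), key=lambda r: abs(aug[r][col]))
        if abs(aug[pivot][col]) < 1e-12:
            return None
        aug[col], aug[pivot] = aug[pivot], aug[col]
        for r in range(col + 1, n):
            factor = aug[r][col] / aug[col][col]
            for c in range(col, n + 1):
                aug[r][c] -= factor * aug[col][c]
    # Back substitution on the upper-triangular system.
    x = [0.0] * n
    for i in reversed(range(n)):
        s = sum(aug[i][j] * x[j] for j in range(i + 1, n))
        x[i] = (aug[i][n] - s) / aug[i][i]
    return x


def brute_force_slr(points, q, k):
    """Exact k-sparse regression by enumerating all k-subsets of the points.

    points: list of n vectors in R^d (the columns of M); q: a vector in R^d.
    Returns (squared_error, support, coefficients), or None if every
    k-subset is linearly dependent.
    """
    best = None
    for support in combinations(range(len(points)), k):
        cols = [points[i] for i in support]
        # Normal equations: (C^T C) coeffs = C^T q for the chosen columns C.
        gram = [[dot(u, v) for v in cols] for u in cols]
        rhs = [dot(u, q) for u in cols]
        coeffs = solve_linear_system(gram, rhs)
        if coeffs is None:
            continue  # assumption: skip linearly dependent subsets
        residual = [
            q[t] - sum(c * col[t] for c, col in zip(coeffs, cols))
            for t in range(len(q))
        ]
        err = dot(residual, residual)
        if best is None or err < best[0]:
            best = (err, support, coeffs)
    return best


if __name__ == "__main__":
    P = [[1, 0, 0], [0, 1, 0], [0, 0, 1], [1, 1, 0]]
    print(brute_force_slr(P, [2, 2, 0.1], 1))
    print(brute_force_slr(P, [1, 0, 2], 2))
```

The returned support is the set of nonzero coordinates of $\tau$, and the coefficients are its values there. Solving the normal equations is adequate for tiny $k$; a numerically more careful version would use a QR-based least-squares solver.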

Cite

@article{arxiv.1609.08739,
  title  = {Approximate Sparse Linear Regression},
  author = {Sariel Har-Peled and Piotr Indyk and Sepideh Mahabadi},
  journal= {arXiv preprint arXiv:1609.08739},
  year   = {2018}
}