Limits on Gradient Compression for Stochastic Optimization

Prathamesh Mayekar; Himanshu Tyagi

Information Theory 2020-01-27 v1 math.IT

Abstract

We consider stochastic optimization over $\ell_p$ spaces using access to a first-order oracle. We ask: What is the minimum precision required for oracle outputs to retain the unrestricted convergence rates? We characterize this precision for every $p\geq 1$ by deriving information-theoretic lower bounds and by providing quantizers that (almost) achieve these lower bounds. Our quantizers are new and easy to implement. In particular, our results are exact for $p=2$ and $p=\infty$, showing that the minimum precision needed in these settings is $\Theta(d)$ and $\Theta(\log d)$, respectively. The latter result is surprising since recovering the gradient vector will require $\Omega(d)$ bits.
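
To make the notion of "precision" concrete, the following is a minimal sketch of a generic coordinate-wise stochastic quantizer. It is not the quantizer proposed in the paper; it is a textbook-style unbiased uniform scheme, included only to illustrate how an oracle output in $\mathbb{R}^d$ can be described with a fixed number of bits. The names stochastic_quantize and bits_used, and the parameters bound and levels, are hypothetical and chosen for illustration. The sketch assumes every coordinate lies in [-bound, bound] (values outside are clipped, which breaks unbiasedness for those coordinates).

```python
import math
import random


def stochastic_quantize(g, bound, levels):
    """Quantize each coordinate of g onto a uniform grid over [-bound, bound].

    Generic illustration only (assumed scheme, not the paper's quantizer).
    Each coordinate is rounded to one of its two neighbouring grid points,
    choosing the upper one with probability equal to the fractional
    position, so the output is unbiased for coordinates inside the range.
    """
    step = 2.0 * bound / (levels - 1)  # spacing between grid points
    out = []
    for x in g:
        x = max(-bound, min(bound, x))  # clip to the assumed range
        pos = (x + bound) / step        # position measured in grid steps
        low = math.floor(pos)
        # Round up with probability equal to the fractional part.
        idx = low + (1 if random.random() < pos - low else 0)
        idx = min(idx, levels - 1)      # guard against float edge cases
        out.append(-bound + idx * step)
    return out


def bits_used(d, levels):
    """Bits needed to send one grid index per coordinate of a d-vector."""
    return d * math.ceil(math.log2(levels))
```

With a constant number of levels, this scheme spends a constant number of bits per coordinate, so its total cost grows linearly in $d$. That matches the order of the $\Theta(d)$ precision stated for $p=2$, while a $\Theta(\log d)$ budget, as stated for $p=\infty$, cannot be met by any scheme that sends a separate index for every coordinate.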

Keywords

approximation algorithm

Cite

@article{arxiv.2001.09032,
  title  = {Limits on Gradient Compression for Stochastic Optimization},
  author = {Prathamesh Mayekar and Himanshu Tyagi},
  journal= {arXiv preprint arXiv:2001.09032},
  year   = {2020}
}
