
Efficient Unbiased Sparsification

Information Theory 2024-07-25 v2 Machine Learning math.IT Statistics Theory

Abstract

An unbiased $m$-sparsification of a vector $p\in \mathbb{R}^n$ is a random vector $Q\in \mathbb{R}^n$ with mean $p$ that has at most $m<n$ nonzero coordinates. Unbiased sparsification compresses the original vector without introducing bias; it arises in various contexts, such as in federated learning and sampling sparse probability distributions. Ideally, unbiased sparsification should also minimize the expected value of a divergence function $\mathsf{Div}(Q,p)$ that measures how far away $Q$ is from the original $p$. If $Q$ is optimal in this sense, then we call it efficient. Our main results describe efficient unbiased sparsifications for divergences that are either permutation-invariant or additively separable. Surprisingly, the characterization for permutation-invariant divergences is robust to the choice of divergence function, in the sense that our class of optimal $Q$ for squared Euclidean distance coincides with our class of optimal $Q$ for Kullback-Leibler divergence, or indeed any of a wide variety of divergences.
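
To make the definition concrete, here is a minimal Python sketch of one simple unbiased $m$-sparsification: keep a uniformly random set of $m$ coordinates and rescale the survivors by $n/m$. This is only an illustration of the two defining properties (mean $p$, at most $m$ nonzero coordinates). It is not the efficient construction characterized in the paper, and it makes no attempt to minimize any divergence. The function names `uniform_unbiased_sparsify` and `exact_mean` are hypothetical and chosen for this example.

```python
import random
from itertools import combinations


def uniform_unbiased_sparsify(p, m, rng=random):
    """Return one random draw Q with at most m nonzero coordinates and E[Q] = p.

    Illustrative baseline only (an assumption of this sketch, not the paper's
    method): each coordinate survives with probability m/n, and survivors are
    scaled by n/m so the expectation of every coordinate equals p[i].
    """
    n = len(p)
    if not 0 < m < n:
        raise ValueError("need 0 < m < n")
    keep = set(rng.sample(range(n), m))  # uniformly random m-subset of indices
    scale = n / m
    return [p[i] * scale if i in keep else 0.0 for i in range(n)]


def exact_mean(p, m):
    """Compute E[Q] exactly by averaging over all equally likely m-subsets."""
    n = len(p)
    subsets = list(combinations(range(n), m))
    scale = n / m
    mean = [0.0] * n
    for keep in subsets:
        for i in keep:
            mean[i] += p[i] * scale / len(subsets)
    return mean


if __name__ == "__main__":
    p = [0.5, 0.3, 0.2]
    q = uniform_unbiased_sparsify(p, 2, random.Random(0))
    print("one draw:", q)
    print("exact mean:", exact_mean(p, 2))
```

Because the kept set is chosen without looking at $p$, this baseline treats large and small coordinates alike; an efficient sparsification in the sense of the abstract would instead be chosen to minimize the expected divergence $\mathsf{Div}(Q,p)$.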

Cite

@article{arxiv.2402.14925,
  title  = {Efficient Unbiased Sparsification},
  author = {Leighton Barnes and Stephen Cameron and Timothy Chow and Emma Cohen and Keith Frankston and Benjamin Howard and Fred Kochman and Daniel Scheinerman and Jeffrey VanderKam},
  journal= {arXiv preprint arXiv:2402.14925},
  year   = {2024}
}