Streaming Algorithms with Large Approximation Factors

Data Structures and Algorithms 2022-07-19 v1

Abstract

We initiate a broad study of classical problems in the streaming model with insertions and deletions in the setting where we allow the approximation factor $\alpha$ to be much larger than $1$. Such algorithms can use significantly less memory than the usual setting for which $\alpha = 1+\epsilon$ for an $\epsilon \in (0,1)$. We study large approximations for a number of problems in sketching and streaming and the following are some of our results. For the $\ell_p$ norm/quasinorm $\|x\|_p$ of an $n$-dimensional vector $x$, $0 < p \le 2$, we show that obtaining a $\mathrm{poly}(n)$-approximation requires the same amount of memory as obtaining an $O(1)$-approximation for any $M = n^{\Theta(1)}$. For estimating the $\ell_p$ norm, $p > 2$, we show an upper bound of $O(n^{1-2/p} (\log n \log M)/\alpha^{2})$ bits for an $\alpha$-approximation, and give a matching lower bound, for almost the full range of $\alpha \geq 1$ for linear sketches. For the $\ell_2$-heavy hitters problem, we show that the known lower bound of $\Omega(k \log n \log M)$ bits for identifying $(1/k)$-heavy hitters holds even if we are allowed to output items that are $1/(\alpha k)$-heavy, for almost the full range of $\alpha$, provided the algorithm succeeds with probability $1-O(1/n)$. We also obtain a lower bound for linear sketches that is tight even for constant probability algorithms. For estimating the number $\ell_0$ of distinct elements, we give an $n^{1/t}$-approximation algorithm using $O(t \log\log M)$ bits of space, as well as a lower bound of $\Omega(t)$ bits, both excluding the storage of random bits.
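
To make the trade-offs concrete, the stated bounds can be instantiated at a few parameter values. This is only an illustrative calculation from the formulas in the abstract: constants hidden by the $O(\cdot)$ notation are ignored, the symbol $S_p(\alpha)$ is introduced here for the $\ell_p$ upper bound and is not the paper's notation, and the chosen values of $\alpha$ and $t$ are assumed to lie in the range where the bounds apply.

```latex
% l_p estimation, p > 2: space as a function of the approximation factor alpha
S_p(\alpha) = O\!\left(\frac{n^{1-2/p}\,\log n \,\log M}{\alpha^{2}}\right)

% Constant-factor approximation (alpha = O(1)):
S_p(O(1)) = O\!\left(n^{1-2/p}\,\log n \,\log M\right)

% alpha = n^{1/2 - 1/p}, so alpha^2 = n^{1-2/p} cancels the polynomial term:
S_p\!\left(n^{1/2-1/p}\right) = O(\log n \,\log M)

% Distinct elements (l_0): an n^{1/t}-approximation in O(t log log M) bits.
% t = 1 gives an n-approximation in O(\log\log M) bits;
% t = \log_2 n gives n^{1/\log_2 n} = 2, i.e. a 2-approximation in
% O(\log n \,\log\log M) bits.
```

The second $\ell_p$ case matches the elementary inequality $\|x\|_p \le \|x\|_2 \le n^{1/2-1/p}\|x\|_p$ for $p > 2$, under which an $\ell_2$ estimate already serves as an $n^{1/2-1/p}$-approximation of $\|x\|_p$.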

Cite

@article{arxiv.2207.08075,
  title  = {Streaming Algorithms with Large Approximation Factors},
  author = {Yi Li and Honghao Lin and David P. Woodruff and Yuheng Zhang},
  journal= {arXiv preprint arXiv:2207.08075},
  year   = {2022}
}