Maximum Coverage in the Data Stream Model: Parameterized and Generalized

Data Structures and Algorithms 2021-02-18 v1

Abstract

We present algorithms for the Max-Cover and Max-Unique-Cover problems in the data stream model. The input to both problems consists of $m$ subsets of a universe of size $n$ and a value $k \in [m]$. In Max-Cover, the problem is to find a collection of at most $k$ sets such that the number of elements covered by at least one set is maximized. In Max-Unique-Cover, the problem is to find a collection of at most $k$ sets such that the number of elements covered by exactly one set is maximized. Our goal is to design single-pass algorithms that use space that is sublinear in the input size. Our main algorithmic results are as follows. If the sets have size at most $d$, there exist single-pass algorithms using $\tilde{O}(d^{d+1} k^d)$ space that solve both problems exactly. This is optimal up to polylogarithmic factors for constant $d$. If each element appears in at most $r$ sets, we present single-pass algorithms using $\tilde{O}(k^2 r/\epsilon^3)$ space that return a $1+\epsilon$ approximation in the case of Max-Cover. We also present a single-pass algorithm using slightly more memory, i.e., $\tilde{O}(k^3 r/\epsilon^{4})$ space, that $1+\epsilon$ approximates Max-Unique-Cover. In contrast to the above results, when $d$ and $r$ are arbitrary, any constant-pass $1+\epsilon$ approximation algorithm for either problem requires $\Omega(\epsilon^{-2} m)$ space, but a single-pass $O(\epsilon^{-2} m k)$ space algorithm exists. In fact, any constant-pass algorithm with an approximation better than $e/(e-1)$ and $e^{1-1/k}$ for Max-Cover and Max-Unique-Cover, respectively, requires $\Omega(m/k^2)$ space when $d$ and $r$ are unrestricted. En route, we also obtain an algorithm for a parameterized version of the streaming Set-Cover problem.
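
To make the two objectives concrete, the following is a minimal offline sketch in Python that evaluates both by exhaustive search. It is not one of the streaming algorithms from the paper: it stores the whole input and takes exponential time, and the function names (`coverage`, `unique_coverage`, `brute_force`) and the toy instance are hypothetical, chosen only for illustration.

```python
from collections import Counter
from itertools import combinations


def coverage(chosen):
    """Max-Cover objective: elements covered by at least one chosen set."""
    return len(set().union(*chosen)) if chosen else 0


def unique_coverage(chosen):
    """Max-Unique-Cover objective: elements covered by exactly one chosen set."""
    counts = Counter(x for s in chosen for x in s)
    return sum(1 for c in counts.values() if c == 1)


def brute_force(sets, k, objective):
    """Try every collection of at most k sets (illustration only, exponential time)."""
    best_value, best_choice = 0, ()
    for size in range(1, k + 1):
        for choice in combinations(range(len(sets)), size):
            value = objective([sets[i] for i in choice])
            if value > best_value:
                best_value, best_choice = value, choice
    return best_value, best_choice


# Toy instance (assumed for illustration): m = 3 sets over a universe of size n = 6.
sets = [{1, 2, 3}, {3, 4, 5}, {5, 6, 1}]
print(brute_force(sets, 3, coverage))         # (6, (0, 1, 2))
print(brute_force(sets, 3, unique_coverage))  # (4, (0, 1))
```

On this instance the two problems have different optimal solutions: taking all three sets covers every element, but only three of them are covered exactly once, so the best unique coverage uses just two sets.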

Cite

@article{arxiv.2102.08476,
  title  = {Maximum Coverage in the Data Stream Model: Parameterized and Generalized},
  author = {Andrew McGregor and David Tench and Hoa T. Vu},
  journal= {arXiv preprint arXiv:2102.08476},
  year   = {2021}
}

Comments

Conference version to appear at ICDT 2021