Better Streaming Algorithms for the Maximum Coverage Problem

Data Structures and Algorithms 2018-05-11 v6

Abstract

We study the classic NP-hard problem of finding the maximum $k$-set coverage in the data stream model: given a set system of $m$ sets that are subsets of a universe $\{1,\ldots,n\}$, find the $k$ sets that cover the largest number of distinct elements. The problem can be approximated to within a factor of $1-1/e$ in polynomial time. In the streaming-set model, the sets and their elements are revealed online. The main goal of our work is to design algorithms, with approximation guarantees as close as possible to $1-1/e$, that use sublinear space $o(mn)$. Our main results are:

- Two $(1-1/e-\epsilon)$ approximation algorithms: one uses $O(\epsilon^{-1})$ passes and $\tilde{O}(\epsilon^{-2} k)$ space, whereas the other uses only a single pass but $\tilde{O}(\epsilon^{-2} m)$ space.
- We show that any approximation factor better than $1-(1-1/k)^k$ in constant passes requires $\Omega(m)$ space for constant $k$, even if the algorithm is allowed unbounded processing time. We also demonstrate a single-pass $(1-\epsilon)$ approximation algorithm using $\tilde{O}(\epsilon^{-2} m \cdot \min(k,\epsilon^{-1}))$ space.

We also study the maximum $k$-vertex coverage problem in the dynamic graph stream model. In this model, the stream consists of edge insertions and deletions of a graph on $N$ vertices. The goal is to find $k$ vertices that cover the largest number of distinct edges. We show that any constant approximation in constant passes requires $\Omega(N)$ space for constant $k$, whereas $\tilde{O}(\epsilon^{-2} N)$ space is sufficient for a $(1-\epsilon)$ approximation and arbitrary $k$ in a single pass. For regular graphs, we show that $\tilde{O}(\epsilon^{-3} k)$ space is sufficient for a $(1-\epsilon)$ approximation in a single pass. We generalize this to a $(\kappa-\epsilon)$ approximation when the ratio between the minimum and maximum degree is bounded below by $\kappa$.
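To make the problem statement concrete, the following is a minimal sketch of the classical offline greedy heuristic, which is the standard way to obtain the $1-1/e$ factor in polynomial time. It is not one of the streaming algorithms described in the abstract: it assumes the whole set system fits in memory, and the names `greedy_max_coverage` and `coverage` are illustrative only.

```python
from typing import Dict, Hashable, Iterable, List, Set


def greedy_max_coverage(sets: Dict[Hashable, Iterable[int]], k: int) -> List[Hashable]:
    """Pick up to k set names, each time taking the largest marginal gain.

    Offline illustration only (assumption: all sets are held in memory).
    """
    remaining = {name: set(elems) for name, elems in sets.items()}
    covered: Set[int] = set()
    chosen: List[Hashable] = []
    for _ in range(min(k, len(remaining))):
        # Choose the set that adds the most not-yet-covered elements.
        best = max(remaining, key=lambda name: len(remaining[name] - covered))
        if not remaining[best] - covered:
            break  # no remaining set adds anything new
        chosen.append(best)
        covered |= remaining.pop(best)
    return chosen


def coverage(sets: Dict[Hashable, Iterable[int]], chosen: List[Hashable]) -> int:
    """Number of distinct elements covered by the chosen sets."""
    return len(set().union(*(sets[name] for name in chosen)))


if __name__ == "__main__":
    # Hypothetical toy instance: m = 4 sets over the universe {1, ..., 7}.
    example = {"A": {1, 2, 3}, "B": {3, 4}, "C": {4, 5, 6, 7}, "D": {1, 7}}
    picked = greedy_max_coverage(example, k=2)
    print(picked, coverage(example, picked))
```

The maximum $k$-vertex coverage problem fits the same interface if each vertex is represented by the set of (integer-labelled) edges incident to it, although the dynamic stream setting with deletions is beyond what this sketch handles.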

Cite

@article{arxiv.1610.06199,
  title  = {Better Streaming Algorithms for the Maximum Coverage Problem},
  author = {Andrew McGregor and Hoa T. Vu},
  journal= {arXiv preprint arXiv:1610.06199},
  year   = {2018}
}

Comments

- A preliminary version appeared in ICDT 2017. This version fixes typos.