Bounds for Compression in Streaming Models
Abstract
Compression algorithms and streaming algorithms are both powerful tools for dealing with massive data sets, but many of the best compression algorithms -- e.g., those based on the Burrows-Wheeler Transform -- at first seem incompatible with streaming. In this paper we consider several popular streaming models and ask in which, if any, we can compress as well as we can with the BWT. We first prove a nearly tight tradeoff between memory and redundancy for the Standard, Multipass and W-Streams models, demonstrating a bound that is achievable with the BWT but unachievable in those models. We then show we can compute the related Schindler Transform in the StreamSort model and the BWT in the Read-Write model and, thus, achieve that bound.
Cite
@article{arxiv.0711.3338,
title = {Bounds for Compression in Streaming Models},
author = {Travis Gagie},
journal= {arXiv preprint arXiv:0711.3338},
year = {2008}
}
Comments
added reduction from sorting to the Burrows-Wheeler Transform; thus, Grohe and Schweikardt's lower bound for short-sorting implies the same lower bound for the BWT