Faster fully compressed pattern matching by recompression

Data Structures and Algorithms 2013-06-26 v4

Abstract

This paper studies the fully compressed pattern matching problem. The compression is represented by straight-line programs (SLPs), i.e. context-free grammars generating exactly one string; the term "fully" means that both the pattern and the text are given in compressed form. The problem is approached using a recently developed technique of local recompression: the SLPs are refactored so that substrings of the pattern and text are encoded in both SLPs in the same way. To this end, the SLPs are locally decompressed and then recompressed in a uniform way. This technique yields an O((n+m) log M) algorithm for compressed pattern matching, assuming that M fits in O(1) machine words, where n and m are the sizes of the compressed representations of the text and the pattern, respectively, and M is the size of the decompressed pattern. If only m+n fits in O(1) machine words, the running time increases to O((n+m) log M log(n+m)). The previous best algorithm, due to Lifshits, had O(n^2m) running time.
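To make the notion of an SLP concrete, the following minimal Python sketch represents a grammar as a dictionary in which each nonterminal maps either to a literal string or to a pair of nonterminals. This encoding, the names `slp_length`, `slp_expand`, and `naive_occurrences`, and the example grammars are assumptions made for illustration; they are not taken from the paper. The matcher shown is the naive baseline that decompresses everything first, not the recompression algorithm the abstract describes.

```python
from typing import Dict, List, Tuple, Union

# Hypothetical encoding: a rule is a literal string or a pair of nonterminals.
Rule = Union[str, Tuple[str, str]]
SLP = Dict[str, Rule]


def slp_length(rules: SLP, start: str) -> int:
    """Length of the generated string, computed without decompressing."""
    memo: Dict[str, int] = {}

    def length(sym: str) -> int:
        if sym not in memo:
            rhs = rules[sym]
            if isinstance(rhs, str):
                memo[sym] = len(rhs)
            else:
                memo[sym] = length(rhs[0]) + length(rhs[1])
        return memo[sym]

    return length(start)


def slp_expand(rules: SLP, start: str) -> str:
    """Full decompression; the output can be exponentially longer than the SLP."""
    rhs = rules[start]
    if isinstance(rhs, str):
        return rhs
    return slp_expand(rules, rhs[0]) + slp_expand(rules, rhs[1])


def naive_occurrences(text: str, pattern: str) -> List[int]:
    """Baseline only: start positions of pattern in an already decompressed text."""
    hits: List[int] = []
    i = text.find(pattern)
    while i != -1:
        hits.append(i)
        i = text.find(pattern, i + 1)
    return hits


# Text SLP: a Fibonacci-word grammar; X6 generates "abaababa".
FIB: SLP = {
    "X1": "b",
    "X2": "a",
    "X3": ("X2", "X1"),
    "X4": ("X3", "X2"),
    "X5": ("X4", "X3"),
    "X6": ("X5", "X4"),
}

# Pattern SLP generating "aba".
PAT: SLP = {"A": "a", "B": "b", "AB": ("A", "B"), "P": ("AB", "A")}

# 41 rules generating a string of length 2**40: the reason decompression is avoided.
DOUBLING: SLP = {"D0": "a"}
for k in range(1, 41):
    DOUBLING[f"D{k}"] = (f"D{k - 1}", f"D{k - 1}")

if __name__ == "__main__":
    text = slp_expand(FIB, "X6")
    pattern = slp_expand(PAT, "P")
    print(text, pattern, naive_occurrences(text, pattern))
    print(slp_length(DOUBLING, "D40"))
```

The `DOUBLING` grammar illustrates why the decompress-then-match baseline does not scale: the grammar has 41 rules, yet the string it generates has 2^40 characters, so any practical algorithm has to work on the grammars themselves.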

Cite

@article{arxiv.1111.3244,
  title  = {Faster fully compressed pattern matching by recompression},
  author = {Artur Jeż},
  journal= {arXiv preprint arXiv:1111.3244},
  year   = {2013}
}

Comments

Full version, submitted to a journal as is. Overall improvements over the previous version.
