Linear pattern matching on sparse suffix trees

Data Structures and Algorithms 2015-03-19 v1

Abstract

Packing several characters into one computer word is a simple and natural way to compress the representation of a string and to speed up its processing. Exploiting this idea, we propose an index for a packed string, based on a {\em sparse suffix tree} \cite{KU-96} with appropriately defined suffix links. Assuming, under the standard unit-cost RAM model, that a word can store up to $\log_{\sigma} n$ characters ($\sigma$ the alphabet size), our index takes $O(n/\log_{\sigma} n)$ space, i.e. the same space as the packed string itself. The resulting pattern matching algorithm runs in time $O(m + r^2 + r \cdot occ)$, where $m$ is the length of the pattern, $r$ is the actual number of characters stored in a word and $occ$ is the number of pattern occurrences.
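
As a rough illustration of the packing idea only (not of the sparse suffix tree index or the matching algorithm), the sketch below stores $r$ characters per integer "word" using a fixed number of bits per character. The names `pack_string` and `unpack_string`, and the choice of Python integers as words, are assumptions made for this example and do not come from the paper.

```python
def bits_per_char(alphabet):
    # Bits needed to encode one symbol of the alphabet (at least 1).
    return max(1, (len(alphabet) - 1).bit_length())


def pack_string(text, alphabet, chars_per_word):
    """Pack `text` into integers, `chars_per_word` characters per integer.

    Hypothetical helper: each integer plays the role of one machine word.
    """
    bits = bits_per_char(alphabet)
    code = {ch: i for i, ch in enumerate(alphabet)}
    words = []
    for start in range(0, len(text), chars_per_word):
        word = 0
        for ch in text[start:start + chars_per_word]:
            word = (word << bits) | code[ch]
        words.append(word)
    return words


def unpack_string(words, alphabet, chars_per_word, length):
    """Inverse of pack_string; `length` is the original string length."""
    bits = bits_per_char(alphabet)
    mask = (1 << bits) - 1
    out = []
    remaining = length
    for word in words:
        k = min(chars_per_word, remaining)  # the last word may be shorter
        for i in range(k):
            shift = bits * (k - 1 - i)
            out.append(alphabet[(word >> shift) & mask])
        remaining -= k
    return "".join(out)


if __name__ == "__main__":
    packed = pack_string("ACGTAC", "ACGT", 4)
    print(packed)
    print(unpack_string(packed, "ACGT", 4, 6))
```

With $r$ characters per word, a string of length $n$ occupies $\lceil n/r \rceil$ words, which is the $O(n/\log_{\sigma} n)$ bound when $r = \log_{\sigma} n$.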

Cite

@article{arxiv.1103.2613,
  title  = {Linear pattern matching on sparse suffix trees},
  author = {Roman Kolpakov and Gregory Kucherov and Tatiana Starikovskaya},
  journal= {arXiv preprint arXiv:1103.2613},
  year   = {2015}
}