Deterministic Indexing for Packed Strings

Data Structures and Algorithms 2016-12-07 v1

Abstract

Given a string $S$ of length $n$, the classic string indexing problem is to preprocess $S$ into a compact data structure that supports efficient subsequent pattern queries. In the \emph{deterministic} variant the goal is to solve the string indexing problem without any randomization (at preprocessing time or query time). In the \emph{packed} variant the strings are stored with several characters in a single word, giving us the opportunity to read multiple characters simultaneously. Our main result is a new string index in the deterministic \emph{and} packed setting. Given a packed string $S$ of length $n$ over an alphabet of size $\sigma$, we show how to preprocess $S$ in $O(n)$ (deterministic) time and $O(n)$ space such that, given a packed pattern string of length $m$, we can support queries in (deterministic) time $O\left(m/\alpha + \log m + \log \log \sigma\right)$, where $\alpha = w / \log \sigma$ is the number of characters packed in a word of size $w = \Theta(\log n)$. Our query time is always at least as good as the previous best known bounds, and whenever several characters are packed in a word, i.e., $\log \sigma \ll w$, the query times are faster.
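To make the packed setting concrete, the following minimal Python sketch packs a string into simulated $w$-bit words and compares two packed strings one word (that is, up to $\alpha$ characters) at a time. It only illustrates the packed model and the quantity $\alpha = w / \log \sigma$; it is not the index from the paper. The function names (`chars_per_word`, `pack`, `common_prefix_len`) and the integer encoding of characters are assumptions made for illustration, and Python integers stand in for machine words.

```python
def bits_per_char(sigma: int) -> int:
    """Bits needed for one character of an alphabet of size sigma (ceil(log2 sigma))."""
    return max(1, (sigma - 1).bit_length())


def chars_per_word(w: int, sigma: int) -> int:
    """alpha: how many characters fit in one w-bit word."""
    return w // bits_per_char(sigma)


def pack(codes, w: int, sigma: int):
    """Pack integer character codes (each in [0, sigma)) into w-bit words.

    Characters are stored most-significant first; a short final chunk is
    left-aligned and padded with zero bits.
    """
    bits = bits_per_char(sigma)
    alpha = chars_per_word(w, sigma)
    words = []
    for i in range(0, len(codes), alpha):
        chunk = codes[i:i + alpha]
        word = 0
        for c in chunk:
            word = (word << bits) | c
        word <<= bits * (alpha - len(chunk))  # left-align the last chunk
        words.append(word)
    return words


def common_prefix_len(a, len_a: int, b, len_b: int, w: int, sigma: int) -> int:
    """Common prefix length (in characters) of two packed strings.

    Each loop iteration compares alpha characters with a single XOR, which is
    the source of the m / alpha term in packed-string running times.
    """
    bits = bits_per_char(sigma)
    alpha = chars_per_word(w, sigma)
    total_bits = alpha * bits
    for k, (x, y) in enumerate(zip(a, b)):
        diff = x ^ y
        if diff:
            leading_equal_bits = total_bits - diff.bit_length()
            # Cap by the true lengths so zero padding is never counted.
            return min(len_a, len_b, k * alpha + leading_equal_bits // bits)
    return min(len_a, len_b)


# Example: DNA alphabet (sigma = 4) with 64-bit words gives alpha = 32.
DNA = {"A": 0, "C": 1, "G": 2, "T": 3}
text = [DNA[ch] for ch in "ACGT" * 10]   # 40 characters
other = list(text)
other[35] = DNA["A"]                      # introduce a mismatch at position 35
packed_text = pack(text, 64, 4)
packed_other = pack(other, 64, 4)
lcp = common_prefix_len(packed_text, len(text), packed_other, len(other), 64, 4)
```

In this example the 40-character strings occupy two words each, so the mismatch at position 35 is located after two word comparisons rather than 36 character comparisons.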

Cite

@article{arxiv.1612.01748,
  title  = {Deterministic Indexing for Packed Strings},
  author = {Philip Bille and Inge Li Gørtz and Frederik Rye Skjoldjensen},
  journal= {arXiv preprint arXiv:1612.01748},
  year   = {2016}
}