Optimal Time and Space Construction of Suffix Arrays and LCP Arrays for Integer Alphabets

Data Structures and Algorithms 2019-07-16 v5

Abstract

Suffix arrays and LCP arrays are among the most fundamental data structures and are widely used for various kinds of string processing. We consider two problems for a read-only string of length $N$ over an integer alphabet $[1, \dots, \sigma]$ with $1 \leq \sigma \leq N$, where the string contains $\sigma$ distinct characters: the construction of the suffix array, and the simultaneous construction of both the suffix array and the LCP array. For the word RAM model, we propose algorithms that solve both problems in $O(N)$ time using $O(1)$ extra words, which is optimal in both time and space. Here, extra words means the space required beyond that of the input string and the output suffix array and LCP array. Our contribution improves on the previous most efficient algorithms for constructing suffix arrays, namely $O(N)$ time using $\sigma + O(1)$ extra words by [Nong, TOIS 2013] and $O(N \log N)$ time using $O(1)$ extra words by [Franceschini and Muthukrishnan, ICALP 2007]. It also improves on the previous most efficient solution for constructing both suffix arrays and LCP arrays, which runs in $O(N)$ time using $\sigma + O(1)$ extra words through a combination of [Nong, TOIS 2013] and [Manzini, SWAT 2004].
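
For readers unfamiliar with the two output structures, the following is a minimal sketch of what a suffix array and an LCP array contain. It is not the algorithm proposed in the paper: the suffix array is built by naive sorting, and the LCP array uses the well-known rank-based linear scan, which needs $O(N)$ extra words for the `rank` array. The function names `suffix_array` and `lcp_array`, and the convention that `lcp[0] = 0`, are assumptions made for illustration.

```python
def suffix_array(s):
    """Return the start positions of all suffixes of s in lexicographic order.

    Naive construction by sorting suffix copies; far from the O(N) time and
    O(1) extra words discussed in the abstract. Illustration only.
    """
    return sorted(range(len(s)), key=lambda i: s[i:])


def lcp_array(s, sa):
    """Return lcp, where lcp[k] is the length of the longest common prefix
    of the suffixes starting at sa[k - 1] and sa[k]; lcp[0] is set to 0.

    Rank-based linear-time scan; uses O(N) extra words for `rank`.
    """
    n = len(s)
    rank = [0] * n
    for k, i in enumerate(sa):
        rank[i] = k  # position of suffix i in the sorted order

    lcp = [0] * n
    h = 0  # current common-prefix length, reused between iterations
    for i in range(n):
        if rank[i] > 0:
            j = sa[rank[i] - 1]  # suffix that precedes suffix i in sorted order
            while i + h < n and j + h < n and s[i + h] == s[j + h]:
                h += 1
            lcp[rank[i]] = h
            if h > 0:
                h -= 1  # the next suffix shares at least h - 1 characters
        else:
            h = 0
    return lcp


# "banana" encoded over the integer alphabet [1, 3] with a=1, b=2, n=3
text = [2, 1, 3, 1, 3, 1]
sa = suffix_array(text)
lcp = lcp_array(text, sa)
```

Here `sa` lists suffix start positions (0-indexed) in sorted order, and `lcp[k]` records how many leading characters the suffixes at ranks `k - 1` and `k` share. The same functions also accept an ordinary Python string in place of a list of integers.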

Cite

@article{arxiv.1703.01009,
  title  = {Optimal Time and Space Construction of Suffix Arrays and LCP Arrays for Integer Alphabets},
  author = {Keisuke Goto},
  journal= {arXiv preprint arXiv:1703.01009},
  year   = {2019}
}