Practical Entropy-Compressed Rank/Select Dictionary

Data Structures and Algorithms 2007-05-23 v1

Abstract

Rank/Select dictionaries are data structures for an ordered set $S \subset \{0,1,\ldots,n-1\}$ that compute $\mathrm{rank}(x,S)$ (the number of elements in $S$ which are no greater than $x$) and $\mathrm{select}(i,S)$ (the $i$-th smallest element in $S$). They are the fundamental components of succinct data structures for strings, trees, graphs, etc. In those data structures, however, only asymptotic behavior has been considered, and their performance on real data is not satisfactory. In this paper, we propose four novel Rank/Select dictionaries, esp, recrank, vcode and sdarray, each of which is small if the number of elements in $S$ is small, and indeed close to $nH_0(S)$ in practice (where $H_0(S) \leq 1$ is the zero-th order empirical entropy of $S$), and whose query time is superior to that of previous ones. Experimental results reveal the characteristics of our data structures and also show that they are superior to existing implementations in both size and query time.
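To make the two queries and the size target concrete, here is a minimal Python sketch of the definitions only. It is not one of the paper's four structures (esp, recrank, vcode, sdarray): it stores S as a plain sorted list and answers rank by binary search. The function names, the 1-based indexing of select, and the bit-vector form of the entropy (S viewed as a length-n bit vector with m ones, so that H_0 = (m/n) log2(n/m) + ((n-m)/n) log2(n/(n-m))) are assumptions made for illustration; the abstract does not spell them out.

```python
from bisect import bisect_right
from math import log2


def rank(x, s):
    """Number of elements of the sorted list s that are no greater than x."""
    return bisect_right(s, x)


def select(i, s):
    """i-th smallest element of s, counting from 1 (assumed convention)."""
    if not 1 <= i <= len(s):
        raise IndexError("i must satisfy 1 <= i <= len(s)")
    return s[i - 1]


def entropy_bits(n, m):
    """n * H_0 in bits for a length-n bit vector containing m ones.

    Uses the standard zero-th order empirical entropy of a bit vector;
    that the paper uses exactly this form is an assumption.
    """
    if m == 0 or m == n:
        return 0.0
    p = m / n
    return n * (p * log2(1 / p) + (1 - p) * log2(1 / (1 - p)))


# Hypothetical example set S within the universe {0, ..., 15}.
S = [2, 3, 7, 11]
n = 16
```

Under these conventions rank and select are inverse in the sense that rank(select(i, S), S) == i for every valid i, and entropy_bits(n, len(S)) shrinks as S becomes sparser, which is the regime the abstract targets.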

Cite

@article{arxiv.cs/0610001,
  title  = {Practical Entropy-Compressed Rank/Select Dictionary},
  author = {Daisuke Okanohara and Kunihiko Sadakane},
  journal= {arXiv preprint arXiv:cs/0610001},
  year   = {2007}
}