Efficiently computing runs on a trie

Data Structures and Algorithms 2021-04-21 v3

Abstract

A maximal repetition, or run, in a string is a maximal periodic substring whose smallest period is at most half the length of the substring. In this paper, we consider runs that correspond to a path on a trie, that is, a rooted edge-labeled tree, where one endpoint of the path must be an ancestor of the other. For a trie with $n$ edges, we show that the number of runs is less than $n$. We also show an asymptotic lower bound on the maximum density of runs in tries: $\lim_{n\rightarrow\infty}\rho_\mathcal{T}(n)/n \geq 0.993238$, where $\rho_{\mathcal{T}}(n)$ is the maximum number of runs in a trie with $n$ edges. Furthermore, we show an $O(n\log \log n)$ time and $O(n)$ space algorithm for finding all runs.
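To make the definition of a run concrete, the following is a minimal brute-force sketch for the special case of a plain string (a trie consisting of a single path). It is not the algorithm of the paper and runs in polynomial rather than near-linear time; the function names `smallest_period` and `runs` are hypothetical and chosen only for illustration.

```python
def smallest_period(s: str) -> int:
    """Return the smallest p >= 1 such that s[k] == s[k + p] for all valid k."""
    n = len(s)
    for p in range(1, n + 1):
        if all(s[k] == s[k + p] for k in range(n - p)):
            return p
    return n  # only reached for the empty string


def runs(w: str) -> list[tuple[int, int, int]]:
    """Return all runs of w as (start, end, period), 0-indexed, end inclusive.

    Brute force straight from the definition: for each candidate period p,
    scan for maximal blocks of positions k with w[k] == w[k + p].
    """
    n = len(w)
    found = []
    for p in range(1, n // 2 + 1):
        k = 0
        while k + p < n:
            if w[k] != w[k + p]:
                k += 1
                continue
            start = k
            # Extend the block to the right as far as period p holds.
            while k + p < n and w[k] == w[k + p]:
                k += 1
            end = k + p - 1  # last index of the substring with period p
            substring = w[start:end + 1]
            # Keep it only if it contains at least two full periods and
            # p is its smallest period (so each run is reported once).
            if len(substring) >= 2 * p and smallest_period(substring) == p:
                found.append((start, end, p))
    return sorted(found)


if __name__ == "__main__":
    for start, end, period in runs("mississippi"):
        print(start, end, period, "mississippi"[start:end + 1])
```

On the string `mississippi` (11 characters) this sketch reports four runs: `ississi` with period 3, and `ss`, `ss`, `pp` with period 1.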

Cite

@article{arxiv.1901.10633,
  title  = {Efficiently computing runs on a trie},
  author = {Ryo Sugahara and Yuto Nakashima and Shunsuke Inenaga and Hideo Bannai and Masayuki Takeda},
  journal= {arXiv preprint arXiv:1901.10633},
  year   = {2021}
}

Comments

An updated version of the CPM 2019 paper (doi:10.4230/LIPIcs.CPM.2019.23), submitted to a journal.