Suffixient Sets

Data Structures and Algorithms 2024-06-06 v3

Abstract

We define a suffixient set for a text $T[1..n]$ to be a set $S$ of positions between 1 and $n$ such that, for any edge descending from a node $u$ to a node $v$ in the suffix tree of $T$, there is an element $s \in S$ such that $u$'s path label is a suffix of $T[1..s-1]$ and $T[s]$ is the first character of $(u, v)$'s edge label. We first show there is a suffixient set of cardinality at most $2\bar{r}$, where $\bar{r}$ is the number of runs in the Burrows-Wheeler Transform of the reverse of $T$. We then show that, given a straight-line program for $T$ with $g$ rules, we can build an $O(\bar{r} + g)$-space index with which, given a pattern $P[1..m]$, we can find the maximal exact matches (MEMs) of $P$ with respect to $T$ in $O(m \log(\sigma) / \log n + d \log n)$ time, where $\sigma$ is the size of the alphabet and $d$ is the number of times we would fully or partially descend edges in the suffix tree of $T$ while finding those MEMs.
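To make the definition concrete, the following brute-force sketch checks whether a given set of positions is suffixient for a small text. It is an illustration only, not the construction from the paper. It assumes that the text ends with a unique terminator character, so that the internal nodes of the suffix tree are the root (the empty string) and every substring that is followed by at least two distinct characters. The function names `suffix_tree_edges` and `is_suffixient` are hypothetical, and the quadratic enumeration of substrings is only practical for toy inputs.

```python
def suffix_tree_edges(text):
    """Enumerate suffix-tree edges as (path_label_of_u, first_char_of_edge).

    Assumption: `text` ends with a unique terminator (e.g. "$"), so the
    internal nodes are the root plus every substring that occurs followed
    by at least two distinct characters.
    """
    followers = {}
    n = len(text)
    for i in range(n):
        for j in range(i, n):
            # text[i:j] occurs at position i and is followed by text[j].
            followers.setdefault(text[i:j], set()).add(text[j])

    edges = set()
    for alpha, chars in followers.items():
        if alpha == "" or len(chars) >= 2:  # root or branching node
            for c in chars:
                edges.add((alpha, c))
    return edges


def is_suffixient(text, positions):
    """Check the definition directly; `positions` are 1-based, as in T[1..n]."""
    for alpha, c in suffix_tree_edges(text):
        covered = any(
            text[s - 1] == c and text[: s - 1].endswith(alpha)
            for s in positions
        )
        if not covered:
            return False
    return True


if __name__ == "__main__":
    T = "abab$"
    print(is_suffixient(T, {2, 3, 5}))  # True
    print(is_suffixient(T, {3, 5}))     # False: no position covers the root edge labeled "b"
```

For `T = "abab$"` the branching nodes are the root, `b`, and `ab`. Position 3 covers every edge starting with `a`, position 5 covers every edge starting with `$`, and one further position holding a `b` (2 or 4) is needed for the root's `b` edge.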

Cite

@article{arxiv.2312.01359,
  title  = {Suffixient Sets},
  author = {Lore Depuydt and Travis Gagie and Ben Langmead and Giovanni Manzini and Nicola Prezza},
  journal= {arXiv preprint arXiv:2312.01359},
  year   = {2024}
}