Generalized Straight-Line Programs

Gonzalo Navarro; Francisco Olivares; Cristian Urbina

Data Structures and Algorithms 2024-04-11 v1

Abstract

It was recently proved that any Straight-Line Program (SLP) generating a given string can be transformed in linear time into an equivalent balanced SLP of the same asymptotic size. We generalize this proof to a general class of grammars we call Generalized SLPs (GSLPs), which allow rules of the form $A \rightarrow x$ where $x$ is any Turing-complete representation (of size $|x|$) of a sequence of symbols (potentially much longer than $|x|$). We then specialize GSLPs to so-called Iterated SLPs (ISLPs), which allow rules of the form $A \rightarrow \Pi_{i=k_1}^{k_2} B_1^{i^{c_1}}\cdots B_t^{i^{c_t}}$ of size $2t+2$. We prove that ISLPs break, for some text families, the measure $\delta$ based on substring complexity, a lower bound for most measures and compressors exploiting repetitiveness. Further, ISLPs can extract any substring of length $\lambda$ from the represented text $T[1..n]$ in time $O(\lambda + \log^2 n\log\log n)$. This is the first compressed representation for repetitive texts breaking $\delta$ while, at the same time, supporting direct access to arbitrary text symbols in polylogarithmic time. We also show how to compute some substring queries, like range minima and next/previous smaller value, in time $O(\log^2 n \log\log n)$. Finally, we further specialize the grammars to Run-Length SLPs (RLSLPs), which restrict the rules allowed by ISLPs to the form $A \rightarrow B^t$. Apart from inheriting all the previous results with the term $\log^2 n \log\log n$ reduced to the near-optimal $\log n$, we show that RLSLPs can exploit balance to efficiently compute a wide class of substring queries we call "composable", i.e., $f(X \cdot Y)$ can be obtained from $f(X)$ and $f(Y)$ ...
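To make the ISLP rule form concrete, the following Python sketch naively expands a single rule $A \rightarrow \Pi_{i=k_1}^{k_2} B_1^{i^{c_1}}\cdots B_t^{i^{c_t}}$ into the string it generates. It is an illustration only, not the paper's data structure: the function names `expand_iterated` and `rule_size`, the dictionary of already-expanded symbols, and the assumption $1 \le k_1 \le k_2$ are all hypothetical choices made here. It also shows that a run-length rule $A \rightarrow B^t$ arises from the same formula with $k_1 = 1$, $k_2 = t$, and $c_1 = 0$, since $i^0 = 1$.

```python
def expand_iterated(k1, k2, factors, expansions):
    """Naively expand A -> prod_{i=k1}^{k2} B_1^{i^c_1} ... B_t^{i^c_t}.

    factors    : list of (symbol, c_j) pairs, one per B_j (hypothetical encoding)
    expansions : dict mapping each symbol B_j to its already-expanded string
    Assumes 1 <= k1 <= k2 (an assumption of this sketch).
    """
    pieces = []
    for i in range(k1, k2 + 1):
        for symbol, c in factors:
            # B_j is repeated i^c_j times in the i-th iteration.
            pieces.append(expansions[symbol] * (i ** c))
    return "".join(pieces)


def rule_size(factors):
    """Size 2t + 2 of an iterated rule with t factors, as stated in the abstract."""
    return 2 * len(factors) + 2


# Terminals expand to themselves.
expansions = {"a": "a", "b": "b"}

# A -> prod_{i=1}^{4} a^{i^1} b^{i^0}, i.e. "ab" + "aab" + "aaab" + "aaaab".
factors = [("a", 1), ("b", 0)]
text = expand_iterated(1, 4, factors, expansions)

# Run-length rule A -> b^3 written as prod_{i=1}^{3} b^{i^0}.
run = expand_iterated(1, 3, [("b", 0)], expansions)

print(text, len(text), rule_size(factors))
print(run)
```

In this example a rule of size 6 generates a string of length 14, and the generated length grows quadratically in $k_2$ while the rule size stays fixed. Full expansion takes time linear in the output, so it says nothing about the access bounds claimed above, which rely on the paper's own structures.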

Keywords

string algorithms, succinct data structure, computational complexity

Cite

@article{arxiv.2404.07057,
  title  = {Generalized Straight-Line Programs},
  author = {Gonzalo Navarro and Francisco Olivares and Cristian Urbina},
  journal= {arXiv preprint arXiv:2404.07057},
  year   = {2024}
}

Comments

This work is an extended version of articles published in SPIRE 2022 and LATIN 2024, which are now integrated into a coherent framework where specialized results are derived from more general ones, new operations are supported, and proofs are complete. arXiv admin note: substantial text overlap with arXiv:2402.09232
