Compressed Indexing for Consecutive Occurrences

Paweł Gawrychowski; Garance Gourdel; Tatiana Starikovskaya; Teresa Anna Steiner

Data Structures and Algorithms 2023-04-04 v1

Abstract

The fundamental question in algorithms on strings is that of indexing, that is, preprocessing a given string for specific queries. A number of efficient solutions are now known for this problem when the queries ask for an exact occurrence of a given pattern $P$. However, practical applications call for more complex queries, for example ones concerning nearby occurrences of two patterns. Recently, Bille et al. [CPM 2021] introduced a variant of such queries, called gapped consecutive occurrences, in which a query consists of two patterns $P_{1}$ and $P_{2}$ and a range $[a,b]$, and one must find all consecutive occurrences $(q_1,q_2)$ of $P_{1}$ and $P_{2}$ such that $q_2-q_1 \in [a,b]$. Their results show that we cannot hope for a very efficient indexing structure for such queries, even if $a=0$ is fixed (although they also provided a non-trivial upper bound). Motivated by this, we focus on a text given as a straight-line program (SLP) and design an index that takes space polynomial in the size of the grammar and answers such queries in time optimal up to polylog factors.
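
To make the query concrete, here is a minimal brute-force sketch in Python that scans an uncompressed text. It is not the index described in the paper and uses no SLP. The function names `occurrences` and `gapped_consecutive_occurrences` are hypothetical. The sketch also assumes a definition that the abstract does not spell out: a pair $(q_1,q_2)$ is taken to be consecutive when $q_1 < q_2$ and no occurrence of $P_{1}$ or $P_{2}$ starts strictly between $q_1$ and $q_2$.

```python
def occurrences(text, pattern):
    """Return the sorted start positions of all occurrences of pattern in text.

    Overlapping occurrences are included.
    """
    positions = []
    start = text.find(pattern)
    while start != -1:
        positions.append(start)
        start = text.find(pattern, start + 1)
    return positions


def gapped_consecutive_occurrences(text, p1, p2, a, b):
    """Brute-force answer to a gapped consecutive occurrences query.

    Assumption (not stated in the abstract): (q1, q2) is consecutive when
    q1 < q2 and no occurrence of p1 or p2 starts strictly between them.
    Under that assumption q1 and q2 must be neighbours in the merged,
    sorted list of all distinct occurrence positions.
    """
    occ1 = set(occurrences(text, p1))
    occ2 = set(occurrences(text, p2))
    events = sorted(occ1 | occ2)

    result = []
    for left, right in zip(events, events[1:]):
        # Keep the pair only if it is (P1, P2) and its distance lies in [a, b].
        if left in occ1 and right in occ2 and a <= right - left <= b:
            result.append((left, right))
    return result
```

For example, on the text `abxxcdabcd` with patterns `ab` and `cd`, the consecutive pairs under this assumption are `(0, 4)` and `(6, 8)`, so the range `[0, 3]` keeps only `(6, 8)`. The scan takes time at least linear in the text length per query, which is the cost a compressed index aims to avoid.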

Keywords

string algorithms; succinct data structure; computational complexity

Cite

@article{arxiv.2304.00887,
  title  = {Compressed Indexing for Consecutive Occurrences},
  author = {Paweł Gawrychowski and Garance Gourdel and Tatiana Starikovskaya and Teresa Anna Steiner},
  journal= {arXiv preprint arXiv:2304.00887},
  year   = {2023}
}

Comments

This is a full version of a paper accepted to CPM 2023.
