Compressed Indexing for Consecutive Occurrences
Abstract
The fundamental question considered in algorithms on strings is that of indexing, that is, preprocessing a given string for specific queries. By now we have a number of efficient solutions for this problem when the queries ask for an exact occurrence of a given pattern $P$. However, practical applications motivate the necessity of considering more complex queries, for example concerning near occurrences of two patterns. Recently, Bille et al. [CPM 2021] introduced a variant of such queries, called gapped consecutive occurrences, in which a query consists of two patterns $P_1$ and $P_2$ and a range $[a,b]$, and one must find all consecutive occurrences $(q_1,q_2)$ of $P_1$ and $P_2$ such that $q_2 - q_1 \in [a,b]$. By their results, we cannot hope for a very efficient indexing structure for such queries, even if $a=0$ is fixed (although at the same time they provided a non-trivial upper bound). Motivated by this, we focus on a text given as a straight-line program (SLP) and design an index taking space polynomial in the size of the grammar that answers such queries in time optimal up to polylog factors.
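To make the query semantics concrete, here is a minimal brute-force sketch on an uncompressed text. It is not the index from the paper and makes no use of the SLP; it only illustrates what a gapped consecutive occurrences query returns. The function names are hypothetical. It assumes that a pair $(q_1, q_2)$ is consecutive when $q_1 < q_2$ and no occurrence of either pattern starts strictly between the two positions; the abstract does not spell out this definition.

```python
from bisect import bisect_right


def occurrences(text, pattern):
    """Return all 0-based start positions of pattern in text (overlaps included)."""
    positions = []
    start = text.find(pattern)
    while start != -1:
        positions.append(start)
        start = text.find(pattern, start + 1)
    return positions


def gapped_consecutive_occurrences(text, p1, p2, a, b):
    """Naive reference answer for a query (p1, p2, [a, b]).

    Assumption (not taken from the abstract): (q1, q2) is consecutive when
    q1 < q2 and no occurrence of p1 or p2 starts strictly between q1 and q2.
    """
    occ1 = occurrences(text, p1)
    occ2 = occurrences(text, p2)
    result = []
    for i, q1 in enumerate(occ1):
        # First occurrence of p2 that starts strictly after q1.
        j = bisect_right(occ2, q1)
        if j == len(occ2):
            break  # no occurrence of p2 follows q1 or any later occurrence of p1
        q2 = occ2[j]
        # Another occurrence of p1 strictly between q1 and q2 breaks consecutiveness.
        if i + 1 < len(occ1) and occ1[i + 1] < q2:
            continue
        # Keep the pair only if its distance falls in the query range.
        if a <= q2 - q1 <= b:
            result.append((q1, q2))
    return result


if __name__ == "__main__":
    text = "a.b.a.a..b"
    print(gapped_consecutive_occurrences(text, "a", "b", 0, 3))
```

This baseline scans the whole text for each query, which is the cost an index is meant to avoid; it can serve as a correctness oracle when testing a real index on small inputs.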
Cite
@article{arxiv.2304.00887,
title = {Compressed Indexing for Consecutive Occurrences},
author = {Paweł Gawrychowski and Garance Gourdel and Tatiana Starikovskaya and Teresa Anna Steiner},
journal = {arXiv preprint arXiv:2304.00887},
year = {2023}
}
Comments
This is a full version of a paper accepted to CPM 2023