Related papers: String Indexing with Compressed Patterns

Compressed Indexing with Signature Grammars

The compressed indexing problem is to preprocess a string $S$ of length $n$ into a compressed representation that supports pattern matching queries. That is, given a string $P$ of length $m$, report all occurrences of $P$ in $S$. We present…

Data Structures and Algorithms · Computer Science 2018-04-12 Anders Roy Christiansen, Mikko Berggren Ettienne

The Wavelet Trie: Maintaining an Indexed Sequence of Strings in Compressed Space

An indexed sequence of strings is a data structure for storing a string sequence that supports random access, searching, range counting and analytics operations, both for exact matches and prefix search. String sequences lie at the core of…

Data Structures and Algorithms · Computer Science 2012-04-17 Roberto Grossi, Giuseppe Ottaviano

Queries on LZ-Bounded Encodings

We describe a data structure that stores a string $S$ in space similar to that of its Lempel-Ziv encoding and efficiently supports access, rank and select queries. These queries are fundamental for implementing succinct and compressed data…

Data Structures and Algorithms · Computer Science 2014-12-03 Djamal Belazzougui, Travis Gagie, Paweł Gawrychowski, Juha Kärkkäinen, Alberto Ordóñez, Simon J. Puglisi, Yasuo Tabei

Compressed Dictionary Matching on Run-Length Encoded Strings

Given a set of pattern strings $\mathcal{P}=\{P_1, P_2,\ldots P_k\}$ and a text string $S$, the classic dictionary matching problem is to report all occurrences of each pattern in $S$. We study the dictionary problem in the compressed…

Data Structures and Algorithms · Computer Science 2025-09-04 Philip Bille, Inge Li Gørtz, Simon J. Puglisi, Simon R. Tarnow

Compressed Range Minimum Queries

Given a string $S$ of $n$ integers in $[0,\sigma)$, a range minimum query RMQ$(i, j)$ asks for the index of the smallest integer in $S[i \dots j]$. It is well known that the problem can be solved with a succinct data structure of size $2n +…

Data Structures and Algorithms · Computer Science 2019-05-30 Paweł Gawrychowski, Seungbum Jo, Shay Mozes, Oren Weimann

Time-Space Trade-Offs for Lempel-Ziv Compressed Indexing

Given a string $S$, the compressed indexing problem is to preprocess $S$ into a compressed representation that supports fast substring queries. The goal is to use little space relative to the compressed size of $S$ while…

Data Structures and Algorithms · Computer Science 2018-01-10 Philip Bille, Mikko Berggren Ettienne, Inge Li Gørtz, Hjalte Wedel Vildhøj

Compressed String Dictionaries

The problem of storing a set of strings (a string dictionary) in compact form appears naturally in many cases. While classically it has represented a small part of the whole data to be processed (e.g., for Natural Language processing…

Data Structures and Algorithms · Computer Science 2011-01-31 Nieves R. Brisaboa, Rodrigo Cánovas, Miguel A. Martínez-Prieto, Gonzalo Navarro

Indexing Highly Repetitive String Collections

Two decades ago, a breakthrough in indexing string collections made it possible to represent them within their compressed space while at the same time offering indexed search functionalities. As this new technology permeated through…

Data Structures and Algorithms · Computer Science 2022-11-28 Gonzalo Navarro

Subpath Queries on Compressed Graphs: a Survey

Text indexing is a classical algorithmic problem that has been studied for over four decades: given a text $T$, pre-process it off-line so that, later, we can quickly count and locate the occurrences of any string (the query pattern) in $T$…

Data Structures and Algorithms · Computer Science 2020-12-15 Nicola Prezza

Optimal Random Access and Conditional Lower Bounds for 2D Compressed Strings

Compressed indexing is a powerful technique that enables efficient querying over data stored in compressed form, significantly reducing memory usage and often accelerating computation. While extensive progress has been made for…

Data Structures and Algorithms · Computer Science 2025-10-23 Rajat De , Dominik Kempa

Pattern matching in Lempel-Ziv compressed strings: fast, simple, and deterministic

Countless variants of the Lempel-Ziv compression are widely used in many real-life applications. This paper is concerned with a natural modification of the classical pattern matching problem inspired by the popularity of such compression…

Data Structures and Algorithms · Computer Science 2011-04-22 Pawel Gawrychowski

Faster subsequence recognition in compressed strings

Computation on compressed strings is one of the key approaches to processing massive data sets. We consider local subsequence recognition problems on strings compressed by straight-line programs (SLP), which is closely related to…

Data Structures and Algorithms · Computer Science 2011-11-10 Alexander Tiskin

Top Tree Compression of Tries

We present a compressed representation of tries based on top tree compression [ICALP 2013] that works on a standard, comparison-based, pointer machine model of computation and supports efficient prefix search queries. Namely, we show how to…

Data Structures and Algorithms · Computer Science 2019-09-23 Philip Bille, Inge Li Gørtz, Paweł Gawrychowski, Gad M. Landau, Oren Weimann

Self-Index based on LZ77 (thesis)

Domains like bioinformatics, version control systems, collaborative editing systems (wiki), and others, are producing huge data collections that are very repetitive. That is, there are few differences between the elements of the collection.…

Data Structures and Algorithms · Computer Science 2011-12-21 Sebastian Kreft, Gonzalo Navarro

Solving Classical String Problems on Compressed Texts

Here we study the complexity of string problems as a function of the size of a program that generates input. We consider straight-line programs (SLP), since all algorithms on SLP-generated strings could be applied to processing…

Data Structures and Algorithms · Computer Science 2007-05-23 Yury Lifshits

Decompressing Lempel-Ziv Compressed Text

We consider the problem of decompressing the Lempel-Ziv 77 representation of a string $S$ of length $n$ using a working space as close as possible to the size $z$ of the input. The folklore solution for the problem runs in $O(n)$ time but…

Data Structures and Algorithms · Computer Science 2019-11-05 Philip Bille, Mikko Berggren Ettienne, Travis Gagie, Inge Li Gørtz, Nicola Prezza

Engineering Fast and Space-Efficient Recompression from SLP-Compressed Text

Compressed indexing enables powerful queries over massive and repetitive textual datasets using space proportional to the compressed input. While theoretical advances have led to highly efficient index structures, their practical…

Data Structures and Algorithms · Computer Science 2025-10-24 Ankith Reddy Adudodla, Dominik Kempa

Universal Indexes for Highly Repetitive Document Collections

Indexing highly repetitive collections has become a relevant problem with the emergence of large repositories of versioned documents, among other applications. These collections may reach huge sizes, but are formed mostly of documents that…

Information Retrieval · Computer Science 2016-05-25 Francisco Claude, Antonio Fariña, Miguel A. Martínez-Prieto, Gonzalo Navarro

Compressed Indexing for Consecutive Occurrences

The fundamental question considered in algorithms on strings is that of indexing, that is, preprocessing a given string for specific queries. By now we have a number of efficient solutions for this problem when the queries ask for an exact…

Data Structures and Algorithms · Computer Science 2023-04-04 Paweł Gawrychowski, Garance Gourdel, Tatiana Starikovskaya, Teresa Anna Steiner

Improved Compressed String Dictionaries

We introduce a new family of compressed data structures to efficiently store and query large string dictionaries in main memory. Our main technique is a combination of hierarchical Front-coding with ideas from longest-common-prefix…

Data Structures and Algorithms · Computer Science 2019-11-20 Nieves R. Brisaboa, Ana Cerdeira-Pena, Guillermo de Bernardo, Gonzalo Navarro