English
Related papers

Related papers: String Indexing with Compressed Patterns

200 papers

The compressed indexing problem is to preprocess a string $S$ of length $n$ into a compressed representation that supports pattern matching queries. That is, given a string $P$ of length $m$ report all occurrences of $P$ in $S$. We present…

Data Structures and Algorithms · Computer Science 2018-04-12 Anders Roy Christiansen , Mikko Berggren Ettienne

An indexed sequence of strings is a data structure for storing a string sequence that supports random access, searching, range counting and analytics operations, both for exact matches and prefix search. String sequences lie at the core of…

Data Structures and Algorithms · Computer Science 2012-04-17 Roberto Grossi , Giuseppe Ottaviano

We describe a data structure that stores a string $S$ in space similar to that of its Lempel-Ziv encoding and efficiently supports access, rank and select queries. These queries are fundamental for implementing succinct and compressed data…

Data Structures and Algorithms · Computer Science 2014-12-03 Djamal Belazzougui , Travis Gagie , Paweł Gawrychowski , Juha Kärkkäinen , Alberto Ordóñez , Simon J. Puglisi , Yasuo Tabei

Given a set of pattern strings $\mathcal{P}=\{P_1, P_2,\ldots P_k\}$ and a text string $S$, the classic dictionary matching problem is to report all occurrences of each pattern in $S$. We study the dictionary problem in the compressed…

Data Structures and Algorithms · Computer Science 2025-09-04 Philip Bille , Inge Li Gørtz , Simon J. Puglisi , Simon R. Tarnow

Given a string $S$ of $n$ integers in $[0,\sigma)$, a range minimum query RMQ$(i, j)$ asks for the index of the smallest integer in $S[i \dots j]$. It is well known that the problem can be solved with a succinct data structure of size $2n +…

Data Structures and Algorithms · Computer Science 2019-05-30 Paweł Gawrychowski , Seungbum Jo , Shay Mozes , Oren Weimann

Given a string $S$, the \emph{compressed indexing problem} is to preprocess $S$ into a compressed representation that supports fast \emph{substring queries}. The goal is to use little space relative to the compressed size of $S$ while…

Data Structures and Algorithms · Computer Science 2018-01-10 Philip Bille , Mikko Berggren Ettienne , Inge Li Gørtz , Hjalte Wedel Vildhøj

The problem of storing a set of strings --- a string dictionary --- in compact form appears naturally in many cases. While classically it has represented a small part of the whole data to be processed (e.g., for Natural Language processing…

Data Structures and Algorithms · Computer Science 2011-01-31 Nieves R. Brisaboa , Rodrigo Cánovas , Miguel A. Martínez-Prieto , Gonzalo Navarro

Two decades ago, a breakthrough in indexing string collections made it possible to represent them within their compressed space while at the same time offering indexed search functionalities. As this new technology permeated through…

Data Structures and Algorithms · Computer Science 2022-11-28 Gonzalo Navarro

Text indexing is a classical algorithmic problem that has been studied for over four decades: given a text $T$, pre-process it off-line so that, later, we can quickly count and locate the occurrences of any string (the query pattern) in $T$…

Data Structures and Algorithms · Computer Science 2020-12-15 Nicola Prezza

Compressed indexing is a powerful technique that enables efficient querying over data stored in compressed form, significantly reducing memory usage and often accelerating computation. While extensive progress has been made for…

Data Structures and Algorithms · Computer Science 2025-10-23 Rajat De , Dominik Kempa

Countless variants of the Lempel-Ziv compression are widely used in many real-life applications. This paper is concerned with a natural modification of the classical pattern matching problem inspired by the popularity of such compression…

Data Structures and Algorithms · Computer Science 2011-04-22 Pawel Gawrychowski

Computation on compressed strings is one of the key approaches to processing massive data sets. We consider local subsequence recognition problems on strings compressed by straight-line programs (SLP), which is closely related to…

Data Structures and Algorithms · Computer Science 2011-11-10 Alexander Tiskin

We present a compressed representation of tries based on top tree compression [ICALP 2013] that works on a standard, comparison-based, pointer machine model of computation and supports efficient prefix search queries. Namely, we show how to…

Data Structures and Algorithms · Computer Science 2019-09-23 Philip Bille , Inge Li Gørtz , Paweł Gawrychowski , Gad M. Landau , Oren Weimann

Domains like bioinformatics, version control systems, collaborative editing systems (wiki), and others, are producing huge data collections that are very repetitive. That is, there are few differences between the elements of the collection.…

Data Structures and Algorithms · Computer Science 2011-12-21 Sebastian Kreft , Gonzalo Navarro

Here we study the complexity of string problems as a function of the size of a program that generates input. We consider straight-line programs (SLP), since all algorithms on SLP-generated strings could be applied to processing…

Data Structures and Algorithms · Computer Science 2007-05-23 Yury Lifshits

We consider the problem of decompressing the Lempel--Ziv 77 representation of a string $S$ of length $n$ using a working space as close as possible to the size $z$ of the input. The folklore solution for the problem runs in $O(n)$ time but…

Data Structures and Algorithms · Computer Science 2019-11-05 Philip Bille , Mikko Berggren Ettienne , Travis Gagie , Inge Li Gørtz , Nicola Prezza

Compressed indexing enables powerful queries over massive and repetitive textual datasets using space proportional to the compressed input. While theoretical advances have led to highly efficient index structures, their practical…

Data Structures and Algorithms · Computer Science 2025-10-24 Ankith Reddy Adudodla , Dominik Kempa

Indexing highly repetitive collections has become a relevant problem with the emergence of large repositories of versioned documents, among other applications. These collections may reach huge sizes, but are formed mostly of documents that…

Information Retrieval · Computer Science 2016-05-25 Francisco Claude , Antonio Fariña , Miguel A. Martínez-Prieto , Gonzalo Navarro

The fundamental question considered in algorithms on strings is that of indexing, that is, preprocessing a given string for specific queries. By now we have a number of efficient solutions for this problem when the queries ask for an exact…

Data Structures and Algorithms · Computer Science 2023-04-04 Paweł Gawrychowski , Garance Gourdel , Tatiana Starikovskaya , Teresa Anna Steiner

We introduce a new family of compressed data structures to efficiently store and query large string dictionaries in main memory. Our main technique is a combination of hierarchical Front-coding with ideas from longest-common-prefix…

Data Structures and Algorithms · Computer Science 2019-11-20 Nieves R. Brisaboa , Ana Cerdeira-Pena , Guillermo de Bernardo , Gonzalo Navarro
‹ Prev 1 2 3 10 Next ›