Related papers: Dynamic Suffix Array in Optimal Compressed Space

Collapsing the Hierarchy of Compressed Data Structures: Suffix Arrays in Optimal Compressed Space

In the last decades, the necessity to process massive amounts of textual data fueled the development of compressed text indexes: data structures efficiently answering queries on a given text while occupying space proportional to the…

Data Structures and Algorithms · Computer Science 2024-09-24 Dominik Kempa, Tomasz Kociumaka

Dynamic Suffix Array with Polylogarithmic Queries and Updates

The suffix array $SA[1..n]$ of a text $T$ of length $n$ is a permutation of $\{1,\ldots,n\}$ describing the lexicographical ordering of suffixes of $T$, and it is considered to be among the most important data structures in string…

Data Structures and Algorithms · Computer Science 2022-06-17 Dominik Kempa, Tomasz Kociumaka

Dynamic Suffix Array with Sub-linear update time and Poly-logarithmic Lookup Time

The Suffix Array $SA_S[1\ldots n]$ of an $n$-length string $S$ is a lexicographically sorted array of the suffixes of $S$. The suffix array is one of the most well known and widely used data structures in string algorithms. We present a…

Data Structures and Algorithms · Computer Science 2021-12-24 Amihood Amir, Itai Boneh

Tight Lower Bounds for Central String Queries in Compressed Space

In this work, we study the limits of compressed data structures, i.e., structures that support various queries on an input text $T\in\Sigma^n$ using space proportional to the size of $T$ in compressed form. Nearly all fundamental queries…

Data Structures and Algorithms · Computer Science 2025-10-23 Dominik Kempa, Tomasz Kociumaka

Breaking the $O(n)$-Barrier in the Construction of Compressed Suffix Arrays and Suffix Trees

The suffix array and the suffix tree are the two most fundamental data structures for string processing. For a length-$n$ text, however, they use $\Theta(n \log n)$ bits of space, which is often too costly. To address this, Grossi and…

Data Structures and Algorithms · Computer Science 2023-04-20 Dominik Kempa, Tomasz Kociumaka

Update Query Time Trade-off for dynamic Suffix Arrays

The Suffix Array SA(S) of a string S[1 ... n] is an array containing all the suffixes of S sorted by lexicographic order. The suffix array is one of the most well known indexing data structures, and it functions as a key tool in many string…

Data Structures and Algorithms · Computer Science 2020-07-15 Amihood Amir, Itai Boneh

Compressed Data Structures for Dynamic Sequences

We consider the problem of storing a dynamic string $S$ over an alphabet $\Sigma=\{\,1,\ldots,\sigma\,\}$ in compressed form. Our representation supports insertions and deletions of symbols and answers three fundamental queries:…

Data Structures and Algorithms · Computer Science 2015-07-27 J. Ian Munro, Yakov Nekrich

Dynamic Grammar-Compressed Self-Index in $\delta$-Optimal Space

A compressed self-index stores a string in compressed form while supporting locate queries without decompression. For highly repetitive strings (arising in web crawls, versioned documents, and genomic collections), static self-indexes can…

Data Structures and Algorithms · Computer Science 2026-04-29 Takaaki Nishimoto, Yasuo Tabei

On the Use of Suffix Arrays for Memory-Efficient Lempel-Ziv Data Compression

Much research has been devoted to optimizing algorithms of the Lempel-Ziv (LZ) 77 family, both in terms of speed and memory requirements. Binary search trees and suffix trees (ST) are data structures that have been often used for this…

Data Structures and Algorithms · Computer Science 2016-11-17 Artur Ferreira, Arlindo Oliveira, Mario Figueiredo

Compressed Dynamic Range Majority and Minority Data Structures

In the range $\alpha$-majority query problem, we are given a sequence $S[1..n]$ and a fixed threshold $\alpha \in (0, 1)$, and are asked to preprocess $S$ such that, given a query range $[i..j]$, we can efficiently report the symbols that…

Data Structures and Algorithms · Computer Science 2018-05-24 Travis Gagie, Meng He, Gonzalo Navarro

Scalable and Efficient Construction of Suffix Array with MapReduce and In-Memory Data Store System

Suffix Array (SA) is a cardinal data structure in many pattern matching applications, including data compression, plagiarism detection and sequence alignment. However, as the volumes of data increase abruptly, the construction of SA is not…

Distributed, Parallel, and Cluster Computing · Computer Science 2017-05-16 Hsiang-Huang Wu, Chien-Min Wang, Hsuan-Chi Kuo, Wei-Chun Chung, Jan-Ming Ho

Compressing Suffix Trees by Path Decompositions

The suffix tree is arguably the most fundamental data structure on strings: introduced by Weiner (SWAT 1973) and McCreight (JACM 1976), it allows solving a myriad of computational problems on strings in linear time. Motivated by its large…

Data Structures and Algorithms · Computer Science 2026-05-07 Ruben Becker, Davide Cenzato, Travis Gagie, Sung-Hwan Kim, Ragnar Groot Koerkamp, Giovanni Manzini, Nicola Prezza

Space-Efficient Construction of Compressed Indexes in Deterministic Linear Time

We show that the compressed suffix array and the compressed suffix tree of a string $T$ can be built in $O(n)$ deterministic time using $O(n\log\sigma)$ bits of space, where $n$ is the string length and $\sigma$ is the alphabet size.…

Data Structures and Algorithms · Computer Science 2016-11-15 J. Ian Munro, Gonzalo Navarro, Yakov Nekrich

Explaining the Inherent Tradeoffs for Suffix Array Functionality: Equivalences between String Problems and Prefix Range Queries

We study the fundamental question of how efficiently suffix array entries can be accessed when the array cannot be stored explicitly. The suffix array $SA_T[1..n]$ of a text $T$ of length $n$ encodes the lexicographic order of its suffixes…

Data Structures and Algorithms · Computer Science 2025-10-23 Dominik Kempa, Tomasz Kociumaka

Compressed Spaced Suffix Arrays

Spaced seeds are important tools for similarity search in bioinformatics, and using several seeds together often significantly improves their performance. With existing approaches, however, for each seed we keep a separate linear-size data…

Data Structures and Algorithms · Computer Science 2014-03-11 Travis Gagie, Giovanni Manzini, Daniel Valenzuela

Packed Compact Tries: A Fast and Efficient Data Structure for Online String Processing

In this paper, we present a new data structure called the packed compact trie (packed c-trie) which stores a set $S$ of $k$ strings of total length $n$ in $n \log\sigma + O(k \log n)$ bits of space and supports fast pattern matching queries…

Data Structures and Algorithms · Computer Science 2017-10-11 Takuya Takagi, Shunsuke Inenaga, Kunihiko Sadakane, Hiroki Arimura

Faster Repetition-Aware Compressed Suffix Trees based on Block Trees

Suffix trees are a fundamental data structure in stringology, but their space usage, though linear, is an important problem for their applications. We design and implement a new compressed suffix tree targeted to highly repetitive texts, such…

Data Structures and Algorithms · Computer Science 2019-02-12 Manuel Cáceres, Gonzalo Navarro

Time and Memory Efficient Lempel-Ziv Compression Using Suffix Arrays

The well-known dictionary-based algorithms of the Lempel-Ziv (LZ) 77 family are the basis of several universal lossless compression techniques. These algorithms are asymmetric regarding encoding/decoding time and memory requirements, with…

Data Structures and Algorithms · Computer Science 2009-12-31 Artur Ferreira, Arlindo Oliveira, Mario Figueiredo

Fully dynamic data structure for LCE queries in compressed space

A Longest Common Extension (LCE) query on a text $T$ of length $N$ asks for the length of the longest common prefix of the suffixes starting at two given positions. We show that the signature encoding $\mathcal{G}$ of size $w = O(\min(z \log N…

Data Structures and Algorithms · Computer Science 2016-06-28 Takaaki Nishimoto, Tomohiro I, Shunsuke Inenaga, Hideo Bannai, Masayuki Takeda

Improved Compressed String Dictionaries

We introduce a new family of compressed data structures to efficiently store and query large string dictionaries in main memory. Our main technique is a combination of hierarchical Front-coding with ideas from longest-common-prefix…

Data Structures and Algorithms · Computer Science 2019-11-20 Nieves R. Brisaboa, Ana Cerdeira-Pena, Guillermo de Bernardo, Gonzalo Navarro