Related papers: Compressed Data Structures for Dynamic Sequences
In the range $\alpha$-majority query problem, we are given a sequence $S[1..n]$ and a fixed threshold $\alpha \in (0, 1)$, and are asked to preprocess $S$ such that, given a query range $[i..j]$, we can efficiently report the symbols that…
We describe a data structure that supports access, rank and select queries, as well as symbol insertions and deletions, on a string $S[1,n]$ over alphabet $[1..\sigma]$ in time $O(\lg n/\lg\lg n)$, which is optimal even on binary sequences…
Given a string $S$ of length $N$ on a fixed alphabet of $\sigma$ symbols, a grammar compressor produces a context-free grammar $G$ of size $n$ that generates $S$ and only $S$. In this paper we describe data structures to support the…
We present a data structure that stores a sequence $s[1..n]$ over alphabet $[1..\sigma]$ in $n\Ho(s) + o(n)(\Ho(s){+}1)$ bits, where $\Ho(s)$ is the zero-order entropy of $s$. This structure supports the queries \access, \rank\ and \select,…
Big data, encompassing extensive datasets, has seen rapid expansion, notably with a considerable portion being textual data, including strings and texts. Simple compression methods and standard data structures prove inadequate for…
The rank and select operations over a string of length n from an alphabet of size $\sigma$ have been used widely in the design of succinct data structures. In many applications, the string itself need be maintained dynamically, allowing…
Given a string of length $n$ that is composed of $r$ runs of letters from the alphabet $\{0,1,\ldots,\sigma{-}1\}$ such that $2 \le \sigma \le r$, we describe a data structure that, provided $r \le n / \log^{\omega(1)} n$, stores the string…
The Suffix Array $SA_S[1\ldots n]$ of an $n$-length string $S$ is a lexicographically sorted array of the suffixes of $S$. The suffix array is one of the most well known and widely used data structures in string algorithms. We present a…
The compressed indexing problem is to preprocess a string $S$ of length $n$ into a compressed representation that supports pattern matching queries. That is, given a string $P$ of length $m$ report all occurrences of $P$ in $S$. We present…
A compressed self-index stores a string in compressed form while supporting locate queries without decompression. For highly repetitive strings (arising in web crawls, versioned documents, and genomic collections), static self-indexes can…
In the last decades, the necessity to process massive amounts of textual data fueled the development of compressed text indexes: data structures efficiently answering queries on a given text while occupying space proportional to the…
Given $d$ strings over the alphabet $\{0,1,\ldots,\sigma{-}1\}$, the classical Aho--Corasick data structure allows us to find all $occ$ occurrences of the strings in any text $T$ in $O(|T| + occ)$ time using $O(m\log m)$ bits of space,…
Given a static reference string $R$ and a source string $S$, a relative compression of $S$ with respect to $R$ is an encoding of $S$ as a sequence of references to substrings of $R$. Relative compression schemes are a classic model of…
Suppose that we are given a string $s$ of length $n$ over an alphabet $\{0,1,\ldots,n^{O(1)}\}$ and $\delta$ is the string complexity of $s$, a known compression measure. We describe an index on $s$ with $O(\delta\log\frac{n}{\delta})$…
Given a string $S$ of $n$ integers in $[0,\sigma)$, a range minimum query RMQ$(i, j)$ asks for the index of the smallest integer in $S[i \dots j]$. It is well known that the problem can be solved with a succinct data structure of size $2n +…
In this paper we study the fundamental problem of maintaining a dynamic collection of strings under the following operations: concat - concatenates two strings, split - splits a string into two at a given position, compare - finds the…
In the dynamic indexing problem, we must maintain a changing collection of text documents so that we can efficiently support insertions, deletions, and pattern matching queries. We are especially interested in developing efficient data…
We revisit classic string problems considered in the area of parameterized complexity, and study them through the lens of dynamic data structures. That is, instead of asking for a static algorithm that solves the given instance efficiently,…
The random access problem for compressed strings is to build a data structure that efficiently supports accessing the character in position $i$ of a string given in compressed form. Given a grammar of size $n$ compressing a string of size…
Given a string $S$ of length $n$, the classic string indexing problem is to preprocess $S$ into a compact data structure that supports efficient subsequent pattern queries. In this paper we consider the basic variant where the pattern is…