Related papers: Faster fully compressed pattern matching by recomp…

Solving Classical String Problems on Compressed Texts

Here we study the complexity of string problems as a function of the size of a program that generates input. We consider straight-line programs (SLP), since all algorithms on SLP-generated strings could be applied to processing…

Data Structures and Algorithms · Computer Science 2007-05-23 Yury Lifshits

Pattern Matching on Grammar-Compressed Strings in Linear Time

The most fundamental problem considered in algorithms for text processing is pattern matching: given a pattern $p$ of length $m$ and a text $t$ of length $n$, does $p$ occur in $t$? Multiple versions of this basic question have been…

Data Structures and Algorithms · Computer Science 2021-11-10 Moses Ganardi , Paweł Gawrychowski

Faster subsequence recognition in compressed strings

Computation on compressed strings is one of the key approaches to processing massive data sets. We consider local subsequence recognition problems on strings compressed by straight-line programs (SLP), which is closely related to…

Data Structures and Algorithms · Computer Science 2011-11-10 Alexander Tiskin

Compressed Dictionary Matching on Run-Length Encoded Strings

Given a set of pattern strings $\mathcal{P}=\{P_1, P_2,\ldots P_k\}$ and a text string $S$, the classic dictionary matching problem is to report all occurrences of each pattern in $S$. We study the dictionary problem in the compressed…

Data Structures and Algorithms · Computer Science 2025-09-04 Philip Bille , Inge Li Gørtz , Simon J. Puglisi , Simon R. Tarnow

Computing convolution on grammar-compressed text

The convolution between a text string $S$ of length $N$ and a pattern string $P$ of length $m$ can be computed in $O(N \log m)$ time by FFT. It is known that various types of approximate string matching problems are reducible to…

Data Structures and Algorithms · Computer Science 2013-03-19 Toshiya Tanaka , Tomohiro I , Shunsuke Inenaga , Hideo Bannai , Masayuki Takeda

Compression by Contracting Straight-Line Programs

In grammar-based compression a string is represented by a context-free grammar, also called a straight-line program (SLP), that generates only that string. We refine a recent balancing result stating that one can transform an SLP of size…

Data Structures and Algorithms · Computer Science 2021-07-02 Moses Ganardi

Pattern matching in Lempel-Ziv compressed strings: fast, simple, and deterministic

Countless variants of the Lempel-Ziv compression are widely used in many real-life applications. This paper is concerned with a natural modification of the classical pattern matching problem inspired by the popularity of such compression…

Data Structures and Algorithms · Computer Science 2011-04-22 Pawel Gawrychowski

Efficient Lyndon factorization of grammar compressed text

We present an algorithm for computing the Lyndon factorization of a string that is given in grammar compressed form, namely, a Straight Line Program (SLP). The algorithm runs in $O(n^4 + mn^3h)$ time and $O(n^2)$ space, where $m$ is the…

Data Structures and Algorithms · Computer Science 2013-04-29 Tomohiro I , Yuto Nakashima , Shunsuke Inenaga , Hideo Bannai , Masayuki Takeda

Compressed Indexing with Signature Grammars

The compressed indexing problem is to preprocess a string $S$ of length $n$ into a compressed representation that supports pattern matching queries. That is, given a string $P$ of length $m$ report all occurrences of $P$ in $S$. We present…

Data Structures and Algorithms · Computer Science 2018-04-12 Anders Roy Christiansen , Mikko Berggren Ettienne

Linear Approximate Pattern Matching Algorithm

Pattern matching is a fundamental process in almost every scientific domain. The problem involves finding the positions of a given pattern (usually of short length) in a reference stream of data (usually of large length). The matching can…

Data Structures and Algorithms · Computer Science 2022-07-01 Anas Al-okaily , Abdelghani Tbakhi

Online Grammar Compression for Frequent Pattern Discovery

Various grammar compression algorithms have been proposed in the last decade. A grammar compression is a restricted CFG deriving the string deterministically. An efficient grammar compression develops a smaller CFG by finding duplicated…

Data Structures and Algorithms · Computer Science 2016-09-01 Shouhei Fukunaga , Yoshimasa Takabatake , I Tomohiro , Hiroshi Sakamoto

Tying up the loose ends in fully LZW-compressed pattern matching

We consider a natural generalization of the classical pattern matching problem: given compressed representations of a pattern p[1..M] and a text t[1..N] of sizes m and n, respectively, does p occur in t? We develop an optimal linear time…

Data Structures and Algorithms · Computer Science 2011-09-20 Pawel Gawrychowski

Recompression: a simple and powerful technique for word equations

In this paper we present an application of a simple technique of local recompression, previously developed by the author in the context of compressed membership problems and compressed pattern matching, to word equations. The technique is…

Formal Languages and Automata Theory · Computer Science 2014-03-19 Artur Jeż

Efficient LZ78 factorization of grammar compressed text

We present an efficient algorithm for computing the LZ78 factorization of a text, where the text is represented as a straight line program (SLP), which is a context free grammar in the Chomsky normal form that generates a single string.…

Data Structures and Algorithms · Computer Science 2013-05-27 Hideo Bannai , Shunsuke Inenaga , Masayuki Takeda

Compressed Subsequence Matching and Packed Tree Coloring

We present a new algorithm for subsequence matching in grammar compressed strings. Given a grammar of size $n$ compressing a string of size $N$ and a pattern string of size $m$ over an alphabet of size $\sigma$, our algorithm uses…

Data Structures and Algorithms · Computer Science 2014-06-06 Philip Bille , Patrick Hagge Cording , Inge Li Gørtz

Approximation of grammar-based compression via recompression

In this paper we present a simple linear-time algorithm constructing a context-free grammar of size O(g log(N/g)) for the input string, where N is the size of the input string and g the size of the optimal grammar generating this string.…

Data Structures and Algorithms · Computer Science 2013-11-08 Artur Jeż

Logarithmic-Time Internal Pattern Matching Queries in Compressed and Dynamic Texts

Internal Pattern Matching (IPM) queries on a text $T$, given two fragments $X$ and $Y$ of $T$ such that $|Y|<2|X|$, ask to compute all exact occurrences of $X$ within $Y$. IPM queries have been introduced by Kociumaka, Radoszewski, Rytter,…

Data Structures and Algorithms · Computer Science 2025-03-06 Anouk Duyster , Tomasz Kociumaka

RePair in Compressed Space and Time

Given a string $T$ of length $N$, the goal of grammar compression is to construct a small context-free grammar generating only $T$. Among existing grammar compression methods, RePair (recursive paring) [Larsson and Moffat, 1999] is notable…

Data Structures and Algorithms · Computer Science 2018-11-06 Kensuke Sakai , Tatsuya Ohno , Keisuke Goto , Yoshimasa Takabatake , Tomohiro I , Hiroshi Sakamoto

Decompressing Lempel-Ziv Compressed Text

We consider the problem of decompressing the Lempel--Ziv 77 representation of a string $S$ of length $n$ using a working space as close as possible to the size $z$ of the input. The folklore solution for the problem runs in $O(n)$ time but…

Data Structures and Algorithms · Computer Science 2019-11-05 Philip Bille , Mikko Berggren Ettienne , Travis Gagie , Inge Li Gørtz , Nicola Prezza

Faster Approximate Pattern Matching in Compressed Repetitive Texts

Motivated by the imminent growth of massive, highly redundant genomic databases, we study the problem of compressing a string database while simultaneously supporting fast random access, substring extraction and pattern matching to the…

Data Structures and Algorithms · Computer Science 2012-11-01 Travis Gagie , Paweł Gawrychowski , Christopher Hoobin , Simon J. Puglisi