Related papers: Faster subsequence recognition in compressed strin…

Faster fully compressed pattern matching by recompression

In this paper, a fully compressed pattern matching problem is studied. The compression is represented by straight-line programs (SLPs), i.e. context-free grammars generating exactly one string; the term fully means that both the pattern…

Data Structures and Algorithms · Computer Science 2013-06-26 Artur Jeż

Solving Classical String Problems on Compressed Texts

Here we study the complexity of string problems as a function of the size of a program that generates input. We consider straight-line programs (SLP), since all algorithms on SLP-generated strings could be applied to processing…

Data Structures and Algorithms · Computer Science 2007-05-23 Yury Lifshits

Incongruity-sensitive access to highly compressed strings

Random access to highly compressed strings -- represented by straight-line programs or Lempel-Ziv parses, for example -- is a well-studied topic. Random access to such strings in strongly sublogarithmic time is impossible in the worst case,…

Data Structures and Algorithms · Computer Science 2026-02-05 Ferdinando Cicalese, Zsuzsanna Lipták, Travis Gagie, Gonzalo Navarro, Nicola Prezza, Cristian Urbina

Sublinear Algorithms for Approximating String Compressibility

We raise the question of approximating the compressibility of a string with respect to a fixed compression scheme, in sublinear time. We study this question in detail for two popular lossless compression schemes: run-length encoding (RLE)…

Data Structures and Algorithms · Computer Science 2007-06-11 Sofya Raskhodnikova, Dana Ron, Ronitt Rubinfeld, Adam Smith

Computing convolution on grammar-compressed text

The convolution between a text string $S$ of length $N$ and a pattern string $P$ of length $m$ can be computed in $O(N \log m)$ time by FFT. It is known that various types of approximate string matching problems are reducible to…

Data Structures and Algorithms · Computer Science 2013-03-19 Toshiya Tanaka, Tomohiro I, Shunsuke Inenaga, Hideo Bannai, Masayuki Takeda

Fingerprints in Compressed Strings

The Karp-Rabin fingerprint of a string is a type of hash value that due to its strong properties has been used in many string algorithms. In this paper we show how to construct a data structure for a string $S$ of size $N$ compressed by a…

Data Structures and Algorithms · Computer Science 2013-05-17 Philip Bille, Patrick Hagge Cording, Inge Li Gørtz, Benjamin Sach, Hjalte Wedel Vildhøj, Søren Vind

Pattern Matching on Grammar-Compressed Strings in Linear Time

The most fundamental problem considered in algorithms for text processing is pattern matching: given a pattern $p$ of length $m$ and a text $t$ of length $n$, does $p$ occur in $t$? Multiple versions of this basic question have been…

Data Structures and Algorithms · Computer Science 2021-11-10 Moses Ganardi, Paweł Gawrychowski

Longest Square Subsequence Problem Revisited

The longest square subsequence (LSS) problem consists of computing a longest subsequence of a given string $S$ that is a square, i.e., a longest subsequence of form $XX$ appearing in $S$. It is known that an LSS of a string $S$ of length…

Data Structures and Algorithms · Computer Science 2020-07-30 Takafumi Inoue, Shunsuke Inenaga, Hideo Bannai

Pattern matching in Lempel-Ziv compressed strings: fast, simple, and deterministic

Countless variants of the Lempel-Ziv compression are widely used in many real-life applications. This paper is concerned with a natural modification of the classical pattern matching problem inspired by the popularity of such compression…

Data Structures and Algorithms · Computer Science 2011-04-22 Pawel Gawrychowski

Decompressing Lempel-Ziv Compressed Text

We consider the problem of decompressing the Lempel-Ziv 77 representation of a string $S$ of length $n$ using a working space as close as possible to the size $z$ of the input. The folklore solution for the problem runs in $O(n)$ time but…

Data Structures and Algorithms · Computer Science 2019-11-05 Philip Bille, Mikko Berggren Ettienne, Travis Gagie, Inge Li Gørtz, Nicola Prezza

Compression by Contracting Straight-Line Programs

In grammar-based compression a string is represented by a context-free grammar, also called a straight-line program (SLP), that generates only that string. We refine a recent balancing result stating that one can transform an SLP of size…

Data Structures and Algorithms · Computer Science 2021-07-02 Moses Ganardi

Efficient Lyndon factorization of grammar compressed text

We present an algorithm for computing the Lyndon factorization of a string that is given in grammar compressed form, namely, a Straight Line Program (SLP). The algorithm runs in $O(n^4 + mn^3h)$ time and $O(n^2)$ space, where $m$ is the…

Data Structures and Algorithms · Computer Science 2013-04-29 Tomohiro I, Yuto Nakashima, Shunsuke Inenaga, Hideo Bannai, Masayuki Takeda

Data Race Detection on Compressed Traces

We consider the problem of detecting data races in program traces that have been compressed using straight line programs (SLP), which are special context-free grammars that generate exactly one string, namely the trace that they represent.…

Programming Languages · Computer Science 2018-12-19 Dileep Kini, Umang Mathur, Mahesh Viswanathan

String Indexing with Compressed Patterns

Given a string $S$ of length $n$, the classic string indexing problem is to preprocess $S$ into a compact data structure that supports efficient subsequent pattern queries. In this paper we consider the basic variant where the pattern is…

Data Structures and Algorithms · Computer Science 2024-02-15 Philip Bille, Inge Li Gørtz, Teresa Anna Steiner

Spanner Evaluation over SLP-Compressed Documents

We consider the problem of evaluating regular spanners over compressed documents, i.e., we wish to solve evaluation tasks directly on the compressed data, without decompression. As compressed forms of the documents we use straight-line…

Data Structures and Algorithms · Computer Science 2021-01-27 Markus L. Schmid, Nicole Schweikardt

Detecting regularities on grammar-compressed strings

We solve the problems of detecting and counting various forms of regularities in a string represented as a Straight Line Program (SLP). Given an SLP of size $n$ that represents a string $s$ of length $N$, our algorithm computes all runs and…

Data Structures and Algorithms · Computer Science 2013-04-29 Tomohiro I, Wataru Matsubara, Kouji Shimohira, Shunsuke Inenaga, Hideo Bannai, Masayuki Takeda, Kazuyuki Narisawa, Ayumi Shinohara

Compressed Dictionary Matching on Run-Length Encoded Strings

Given a set of pattern strings $\mathcal{P}=\{P_1, P_2,\ldots P_k\}$ and a text string $S$, the classic dictionary matching problem is to report all occurrences of each pattern in $S$. We study the dictionary problem in the compressed…

Data Structures and Algorithms · Computer Science 2025-09-04 Philip Bille, Inge Li Gørtz, Simon J. Puglisi, Simon R. Tarnow

Fast $q$-gram Mining on SLP Compressed Strings

We present simple and efficient algorithms for calculating $q$-gram frequencies on strings represented in compressed form, namely, as a straight line program (SLP). Given an SLP of size $n$ that represents string $T$, we present an $O(qn)$…

Data Structures and Algorithms · Computer Science 2011-07-14 Keisuke Goto, Hideo Bannai, Shunsuke Inenaga, Masayuki Takeda

The Longest Subsequence-Repeated Subsequence Problem

Motivated by computing duplication patterns in sequences, a new fundamental problem called the longest subsequence-repeated subsequence (LSRS) is proposed. Given a sequence $S$ of length $n$, a letter-repeated subsequence is a subsequence…

Data Structures and Algorithms · Computer Science 2023-09-01 Manuel Lafond, Wenfeng Lai, Adiesha Liyanage, Binhai Zhu

Faster Approximate Pattern Matching in Compressed Repetitive Texts

Motivated by the imminent growth of massive, highly redundant genomic databases, we study the problem of compressing a string database while simultaneously supporting fast random access, substring extraction and pattern matching to the…

Data Structures and Algorithms · Computer Science 2012-11-01 Travis Gagie, Paweł Gawrychowski, Christopher Hoobin, Simon J. Puglisi