Related papers: Space-Efficient Re-Pair Compression

Practical and Effective Re-Pair Compression

Re-Pair is an efficient grammar compressor that operates by recursively replacing high-frequency character pairs with new grammar symbols. The most space-efficient linear-time algorithm computing Re-Pair uses $(1+\epsilon)n+\sqrt n$ words…

Data Structures and Algorithms · Computer Science 2017-04-28 Philip Bille, Inge Li Gørtz, Nicola Prezza

Re-Pair In Small Space

Re-Pair is a grammar compression scheme with favorably good compression rates. The computation of Re-Pair comes with the cost of maintaining large frequency tables, which makes it hard to compute Re-Pair on large scale data sets. As a…

Data Structures and Algorithms · Computer Science 2019-11-19 Dominik Köppl, Tomohiro I, Isamu Furuya, Yoshimasa Takabatake, Kensuke Sakai, Keisuke Goto

RePair in Compressed Space and Time

Given a string $T$ of length $N$, the goal of grammar compression is to construct a small context-free grammar generating only $T$. Among existing grammar compression methods, RePair (recursive pairing) [Larsson and Moffat, 1999] is notable…

Data Structures and Algorithms · Computer Science 2018-11-06 Kensuke Sakai, Tatsuya Ohno, Keisuke Goto, Yoshimasa Takabatake, Tomohiro I, Hiroshi Sakamoto

Practical Repetition-Aware Grammar Compression

The goal of grammar compression is to construct a small sized context free grammar which uniquely generates the input text data. Among grammar compression methods, RePair is known for its good practical compression performance. MR-RePair…

Data Structures and Algorithms · Computer Science 2019-10-31 Isamu Furuya

MR-RePair: Grammar Compression based on Maximal Repeats

We analyze the grammar generation algorithm of the RePair compression algorithm and show the relation between a grammar generated by RePair and maximal repeats. We reveal that RePair replaces step by step the most frequent pairs within the…

Data Structures and Algorithms · Computer Science 2019-02-19 Isamu Furuya, Takuya Takagi, Yuto Nakashima, Shunsuke Inenaga, Hideo Bannai, Takuya Kida

A study for Image compression using Re-Pair algorithm

Compression is an important topic in computer science that allows us to store more data on our storage devices. There are several techniques to compress any file. This manuscript describes the most important…

Multimedia · Computer Science 2019-02-14 Pasquale De Luca, Vincenzo Maria Russiello, Raffaele Ciro Sannino, Lorenzo Valente

Re-Pair Compression of Inverted Lists

Compression of inverted lists with methods that support fast intersection operations is an active research topic. Most compression schemes rely on encoding differences between consecutive positions with techniques that favor small numbers.…

Information Retrieval · Computer Science 2009-11-18 Francisco Claude, Antonio Farina, Gonzalo Navarro

Rpair: Rescaling RePair with Rsync

Data compression is a powerful tool for managing massive but repetitive datasets, especially schemes such as grammar-based compression that support computation over the data without decompressing it. In the best case such a scheme takes a…

Data Structures and Algorithms · Computer Science 2019-06-04 Travis Gagie, Tomohiro I, Giovanni Manzini, Gonzalo Navarro, Hiroshi Sakamoto, Yoshimasa Takabatake

Grammar Boosting: A New Technique for Proving Lower Bounds for Computation over Compressed Data

Grammar compression is a general compression framework in which a string $T$ of length $N$ is represented as a context-free grammar of size $n$ whose language contains only $T$. In this paper, we focus on studying the limitations of…

Data Structures and Algorithms · Computer Science 2024-09-24 Rajat De, Dominik Kempa

RePair Grammars are the Smallest Grammars for Fibonacci Words

Grammar-based compression is a loss-less data compression scheme that represents a given string $w$ by a context-free grammar that generates only $w$. While computing the smallest grammar which generates a given string $w$ is NP-hard in…

Combinatorics · Mathematics 2022-04-18 Takuya Mieno, Shunsuke Inenaga, Takashi Horiyama

Recompression: a simple and powerful technique for word equations

In this paper we present an application of a simple technique of local recompression, previously developed by the author in the context of compressed membership problems and compressed pattern matching, to word equations. The technique is…

Formal Languages and Automata Theory · Computer Science 2014-03-19 Artur Jeż

Online Grammar Compression for Frequent Pattern Discovery

Various grammar compression algorithms have been proposed in the last decade. A grammar compression is a restricted CFG deriving the string deterministically. An efficient grammar compression develops a smaller CFG by finding duplicated…

Data Structures and Algorithms · Computer Science 2016-09-01 Shouhei Fukunaga, Yoshimasa Takabatake, I Tomohiro, Hiroshi Sakamoto

OnPair: Short Strings Compression for Fast Random Access

We present OnPair, a dictionary-based compression algorithm designed to meet the needs of in-memory database systems that require both high compression and fast random access. Existing methods either achieve strong compression ratios at…

Databases · Computer Science 2025-08-05 Francesco Gargiulo, Rossano Venturini

Entropy bounds for grammar compression

Grammar compression represents a string as a context free grammar. Achieving compression requires encoding such grammar as a binary string; there are a few commonly used encodings. We bound the size of practically used encodings for several…

Data Structures and Algorithms · Computer Science 2020-05-21 Michał Gańczorz

Approximation of grammar-based compression via recompression

In this paper we present a simple linear-time algorithm constructing a context-free grammar of size O(g log(N/g)) for the input string, where N is the size of the input string and g the size of the optimal grammar generating this string.…

Data Structures and Algorithms · Computer Science 2013-11-08 Artur Jeż

Tree structure compression with RePair

In this work we introduce a new linear time compression algorithm, called "Re-pair for Trees", which compresses ranked ordered trees using linear straight-line context-free tree grammars. Such grammars generalize straight-line context-free…

Data Structures and Algorithms · Computer Science 2010-08-02 Markus Lohrey, Sebastian Maneth, Roy Mennicke

Faster fully compressed pattern matching by recompression

In this paper, a fully compressed pattern matching problem is studied. The compression is represented by straight-line programs (SLPs), i.e., context-free grammars generating exactly one string; the term fully means that both the pattern…

Data Structures and Algorithms · Computer Science 2013-06-26 Artur Jeż

Pattern Matching on Grammar-Compressed Strings in Linear Time

The most fundamental problem considered in algorithms for text processing is pattern matching: given a pattern $p$ of length $m$ and a text $t$ of length $n$, does $p$ occur in $t$? Multiple versions of this basic question have been…

Data Structures and Algorithms · Computer Science 2021-11-10 Moses Ganardi, Paweł Gawrychowski

Compressed Subsequence Matching and Packed Tree Coloring

We present a new algorithm for subsequence matching in grammar compressed strings. Given a grammar of size $n$ compressing a string of size $N$ and a pattern string of size $m$ over an alphabet of size $\sigma$, our algorithm uses…

Data Structures and Algorithms · Computer Science 2014-06-06 Philip Bille, Patrick Hagge Cording, Inge Li Gørtz

Improved Circular Dictionary Matching

The circular dictionary matching problem is an extension of the classical dictionary matching problem where every string in the dictionary is interpreted as a circular string: after reading the last character of a string, we can move back…

Data Structures and Algorithms · Computer Science 2025-04-07 Nicola Cotumaccio