Related papers: Practical and Effective Re-Pair Compression

200 papers

Re-Pair is an effective grammar-based compression scheme achieving strong compression rates in practice. Let $n$, $\sigma$, and $d$ be the text length, alphabet size, and dictionary size of the final grammar, respectively. In their original…

Data Structures and Algorithms · Computer Science 2016-11-07 Philip Bille, Inge Li Gørtz, Nicola Prezza

Re-Pair is a grammar compression scheme with good compression rates. Computing Re-Pair comes with the cost of maintaining large frequency tables, which makes it hard to compute Re-Pair on large-scale data sets. As a…

Data Structures and Algorithms · Computer Science 2019-11-19 Dominik Köppl, Tomohiro I, Isamu Furuya, Yoshimasa Takabatake, Kensuke Sakai, Keisuke Goto

Given a string $T$ of length $N$, the goal of grammar compression is to construct a small context-free grammar generating only $T$. Among existing grammar compression methods, RePair (recursive pairing) [Larsson and Moffat, 1999] is notable…

Data Structures and Algorithms · Computer Science 2018-11-06 Kensuke Sakai, Tatsuya Ohno, Keisuke Goto, Yoshimasa Takabatake, Tomohiro I, Hiroshi Sakamoto

The goal of grammar compression is to construct a small context-free grammar that uniquely generates the input text. Among grammar compression methods, RePair is known for its good practical compression performance. MR-RePair…

Data Structures and Algorithms · Computer Science 2019-10-31 Isamu Furuya

We analyze the grammar generation algorithm of the RePair compression algorithm and show the relation between a grammar generated by RePair and maximal repeats. We reveal that RePair replaces step by step the most frequent pairs within the…

Data Structures and Algorithms · Computer Science 2019-02-19 Isamu Furuya, Takuya Takagi, Yuto Nakashima, Shunsuke Inenaga, Hideo Bannai, Takuya Kida

Data compression is a powerful tool for managing massive but repetitive datasets, especially schemes such as grammar-based compression that support computation over the data without decompressing it. In the best case such a scheme takes a…

Data Structures and Algorithms · Computer Science 2019-06-04 Travis Gagie, Tomohiro I, Giovanni Manzini, Gonzalo Navarro, Hiroshi Sakamoto, Yoshimasa Takabatake

Grammar-based compression is a lossless data compression scheme that represents a given string $w$ by a context-free grammar that generates only $w$. While computing the smallest grammar that generates a given string $w$ is NP-hard in…

Combinatorics · Mathematics 2022-04-18 Takuya Mieno, Shunsuke Inenaga, Takashi Horiyama

Compression of inverted lists with methods that support fast intersection operations is an active research topic. Most compression schemes rely on encoding differences between consecutive positions with techniques that favor small numbers.…

Information Retrieval · Computer Science 2009-11-18 Francisco Claude, Antonio Farina, Gonzalo Navarro

Compression is an important topic in computer science that allows us to store more data on our storage devices. There are several techniques for compressing a file. This manuscript describes the most important…

Multimedia · Computer Science 2019-02-14 Pasquale De Luca, Vincenzo Maria Russiello, Raffaele Ciro Sannino, Lorenzo Valente

Grammar compression is a general compression framework in which a string $T$ of length $N$ is represented as a context-free grammar of size $n$ whose language contains only $T$. In this paper, we focus on studying the limitations of…

Data Structures and Algorithms · Computer Science 2024-09-24 Rajat De, Dominik Kempa

Grammar compression represents a string as a context-free grammar. Achieving compression requires encoding such a grammar as a binary string; there are a few commonly used encodings. We bound the size of practically used encodings for several…

Data Structures and Algorithms · Computer Science 2020-05-21 Michał Gańczorz

We present OnPair, a dictionary-based compression algorithm designed to meet the needs of in-memory database systems that require both high compression and fast random access. Existing methods either achieve strong compression ratios at…

Databases · Computer Science 2025-08-05 Francesco Gargiulo, Rossano Venturini

Various grammar compression algorithms have been proposed in the last decade. A grammar compression is a restricted CFG deriving the string deterministically. An efficient grammar compression develops a smaller CFG by finding duplicated…

Data Structures and Algorithms · Computer Science 2016-09-01 Shouhei Fukunaga, Yoshimasa Takabatake, I Tomohiro, Hiroshi Sakamoto

In this paper we present an application of a simple technique of local recompression, previously developed by the author in the context of compressed membership problems and compressed pattern matching, to word equations. The technique is…

Formal Languages and Automata Theory · Computer Science 2014-03-19 Artur Jeż

In this paper, a fully compressed pattern matching problem is studied. The compression is represented by straight-line programs (SLPs), i.e., context-free grammars generating exactly one string; the term fully means that both the pattern…

Data Structures and Algorithms · Computer Science 2013-06-26 Artur Jeż

We present a new graph compressor that works by recursively detecting repeated substructures and representing them through grammar rules. We show that for a large number of graphs the compressor obtains smaller representations than other…

Data Structures and Algorithms · Computer Science 2017-04-19 Sebastian Maneth, Fabian Peternek

In this paper we present a simple linear-time algorithm constructing a context-free grammar of size $O(g \log(N/g))$ for the input string, where $N$ is the size of the input string and $g$ the size of the optimal grammar generating this string.…

Data Structures and Algorithms · Computer Science 2013-11-08 Artur Jeż

Compressed indexing is a powerful technique that enables efficient querying over data stored in compressed form, significantly reducing memory usage and often accelerating computation. While extensive progress has been made for…

Data Structures and Algorithms · Computer Science 2025-10-23 Rajat De, Dominik Kempa

In this work we introduce a new linear time compression algorithm, called "Re-pair for Trees", which compresses ranked ordered trees using linear straight-line context-free tree grammars. Such grammars generalize straight-line context-free…

Data Structures and Algorithms · Computer Science 2010-08-02 Markus Lohrey, Sebastian Maneth, Roy Mennicke

The most fundamental problem considered in algorithms for text processing is pattern matching: given a pattern $p$ of length $m$ and a text $t$ of length $n$, does $p$ occur in $t$? Multiple versions of this basic question have been…

Data Structures and Algorithms · Computer Science 2021-11-10 Moses Ganardi, Paweł Gawrychowski