Related papers: Tree structure compression with RePair

200 papers

We revisit tree compression with top trees (Bille et al., ICALP'13) and present several improvements to the compressor and its analysis. By significantly reducing the amount of information stored and guiding the compression step using a…

Data Structures and Algorithms · Computer Science 2015-06-16 Lorenz Hübschle-Schneider , Rajeev Raman

A simple linear-time algorithm for constructing a linear context-free tree grammar of size O(rg + rg log(n/rg)) for a given input tree T of size n is presented, where g is the size of a minimal linear context-free tree grammar for T, and…

Data Structures and Algorithms · Computer Science 2018-10-09 Artur Jeż , Markus Lohrey

We study the compressed representation of a ranked tree by a (string) straight-line program (SLP) for its preorder traversal, and compare it with the well-studied representation by straight-line context-free tree grammars (which are also…

Formal Languages and Automata Theory · Computer Science 2015-09-29 Moses Ganardi , Danny Hucke , Markus Lohrey , Eric Noeth

Given a string $T$ of length $N$, the goal of grammar compression is to construct a small context-free grammar generating only $T$. Among existing grammar compression methods, RePair (recursive pairing) [Larsson and Moffat, 1999] is notable…

Data Structures and Algorithms · Computer Science 2018-11-06 Kensuke Sakai , Tatsuya Ohno , Keisuke Goto , Yoshimasa Takabatake , Tomohiro I , Hiroshi Sakamoto

This paper presents a tree-to-tree transduction method for sentence compression. Our model is based on synchronous tree substitution grammar, a formalism that allows local distortion of the tree topology and can thus naturally capture…

Computation and Language · Computer Science 2014-01-23 Trevor Anthony Cohn , Mirella Lapata

We introduce a new compression scheme for labeled trees based on top trees. Our compression scheme is the first to simultaneously take advantage of internal repeats in the tree (as opposed to the classical DAG compression that only exploits…

Data Structures and Algorithms · Computer Science 2014-05-13 Philip Bille , Inge Li Gørtz , Gad M. Landau , Oren Weimann

We present a compressed representation of tries based on top tree compression [ICALP 2013] that works on a standard, comparison-based, pointer machine model of computation and supports efficient prefix search queries. Namely, we show how to…

Data Structures and Algorithms · Computer Science 2019-09-23 Philip Bille , Inge Li Gørtz , Paweł Gawrychowski , Gad M. Landau , Oren Weimann

Re-Pair is an effective grammar-based compression scheme achieving strong compression rates in practice. Let $n$, $\sigma$, and $d$ be the text length, alphabet size, and dictionary size of the final grammar, respectively. In their original…

Data Structures and Algorithms · Computer Science 2016-11-07 Philip Bille , Inge Li Gørtz , Nicola Prezza

We introduce forest straight-line programs (FSLPs) as a compressed representation of unranked ordered node-labelled trees. FSLPs are based on the operations of forest algebra and generalize tree straight-line programs. We compare the…

Data Structures and Algorithms · Computer Science 2018-02-16 Adrià Gascón , Markus Lohrey , Sebastian Maneth , Carl Philipp Reh , Kurt Sieber

Grammar-based compression, where one replaces a long string by a small context-free grammar that generates the string, is a simple and powerful paradigm that captures many popular compression schemes. In this paper, we present a novel…

Data Structures and Algorithms · Computer Science 2013-10-30 Philip Bille , Gad M. Landau , Rajeev Raman , Kunihiko Sadakane , Srinivasa Rao Satti , Oren Weimann

We consider the problem of restructuring compressed texts without explicit decompression. We present algorithms which allow conversions from compressed representations of a string $T$ produced by any grammar-based compression…

Data Structures and Algorithms · Computer Science 2011-07-15 Keisuke Goto , Shirou Maruyama , Shunsuke Inenaga , Hideo Bannai , Hiroshi Sakamoto , Masayuki Takeda

Data compression is a powerful tool for managing massive but repetitive datasets, especially schemes such as grammar-based compression that support computation over the data without decompressing it. In the best case such a scheme takes a…

Data Structures and Algorithms · Computer Science 2019-06-04 Travis Gagie , Tomohiro I , Giovanni Manzini , Gonzalo Navarro , Hiroshi Sakamoto , Yoshimasa Takabatake

Grammar compression is a general compression framework in which a string $T$ of length $N$ is represented as a context-free grammar of size $n$ whose language contains only $T$. In this paper, we focus on studying the limitations of…

Data Structures and Algorithms · Computer Science 2024-09-24 Rajat De , Dominik Kempa

Measuring the complexity of tree structures can be beneficial in areas that use tree data structures for storage, communication, and processing purposes. This complexity can then be used to compress tree data structures to their…

Information Theory · Computer Science 2023-09-19 Amirmohammad Farzaneh , Mihai-Alin Badiu , Justin P. Coon

Tree kernels have been proposed for use in many areas, such as the automatic learning of natural language applications. In this paper, we propose a new linear-time algorithm based on the concept of weighted tree automata for SubTree kernel…

Computation and Language · Computer Science 2023-02-03 Ludovic Mignot , Faissal Ouardi , Djelloul Ziadi

The goal of grammar compression is to construct a small context-free grammar which uniquely generates the input text data. Among grammar compression methods, RePair is known for its good practical compression performance. MR-RePair…

Data Structures and Algorithms · Computer Science 2019-10-31 Isamu Furuya

We present algorithms that run in linear time on pointer machines for a collection of problems, each of which either directly or indirectly requires the evaluation of a function defined on paths in a tree. These problems previously had…

Data Structures and Algorithms · Computer Science 2007-05-23 Adam L. Buchsbaum , Loukas Georgiadis , Haim Kaplan , Anne Rogers , Robert E. Tarjan , Jeffery R. Westbrook

Binary jumbled pattern matching asks to preprocess a binary string $S$ in order to answer queries $(i,j)$ which ask for a substring of $S$ that is of length $i$ and has exactly $j$ 1-bits. This problem naturally generalizes to…

Data Structures and Algorithms · Computer Science 2014-07-01 Travis Gagie , Danny Hermelin , Gad M. Landau , Oren Weimann

This paper is an extended abstract of an analysis of term rewriting where the terms in the rewrite rules as well as the term to be rewritten are compressed by a singleton tree grammar (STG). This form of compression is more general than…

Logic in Computer Science · Computer Science 2013-02-27 Manfred Schmidt-Schauss

Re-Pair is an efficient grammar compressor that operates by recursively replacing high-frequency character pairs with new grammar symbols. The most space-efficient linear-time algorithm computing Re-Pair uses $(1+\epsilon)n+\sqrt n$ words…

Data Structures and Algorithms · Computer Science 2017-04-28 Philip Bille , Inge Li Gørtz , Nicola Prezza