English
Related papers

Related papers: Tree compression using string grammars

200 papers

We introduce forest straight-line programs (FSLPs) as a compressed representation of unranked ordered node-labelled trees. FSLPs are based on the operations of forest algebra and generalize tree straight-line programs. We compare the…

Data Structures and Algorithms · Computer Science 2018-02-16 Adrià Gascón , Markus Lohrey , Sebastian Maneth , Carl Philipp Reh , Kurt Sieber

In grammar-based compression a string is represented by a context-free grammar, also called a straight-line program (SLP), that generates only that string. We refine a recent balancing result stating that one can transform an SLP of size…

Data Structures and Algorithms · Computer Science 2021-07-02 Moses Ganardi

We study the problem of enumerating the answers to a query formulated in monadic second order logic (MSO) over an unranked forest F that is compressed by a straight-line program (SLP) D. Our main result states that this can be done after…

Formal Languages and Automata Theory · Computer Science 2026-03-17 Markus Lohrey , Markus L. Schmid

In this work we introduce a new linear time compression algorithm, called "Re-pair for Trees", which compresses ranked ordered trees using linear straight-line context-free tree grammars. Such grammars generalize straight-line context-free…

Data Structures and Algorithms · Computer Science 2010-08-02 Markus Lohrey , Sebastian Maneth , Roy Mennicke

Extractive compression is a challenging natural language processing problem. This work contributes by formulating neural extractive compression as a parse tree transduction problem, rather than a sequence transduction task. Motivated by…

Information Retrieval · Computer Science 2018-09-26 Davide Bacciu , Antonio Bruno

A grammar-compressed ranked tree is represented with a linear space overhead so that a single traversal step, i.e., the move to the parent or the i-th child, can be carried out in constant time. Moreover, we extend our data structure such…

Data Structures and Algorithms · Computer Science 2015-11-11 Markus Lohrey , Sebastian Maneth , Carl Philipp Reh

Here we study the complexity of string problems as a function of the size of a program that generates input. We consider straight-line programs (SLP), since all algorithms on SLP-generated strings could be applied to processing…

Data Structures and Algorithms · Computer Science 2007-05-23 Yury Lifshits

We explore an extension to straight-line programs (SLPs) that outperforms, for some text families, the measure $\delta$ based on substring complexity, a lower bound for most measures and compressors exploiting repetitiveness (which are…

Data Structures and Algorithms · Computer Science 2024-02-16 Gonzalo Navarro , Cristian Urbina

We present an algorithm for computing the Lyndon factorization of a string that is given in grammar compressed form, namely, a Straight Line Program (SLP). The algorithm runs in $O(n^4 + mn^3h)$ time and $O(n^2)$ space, where $m$ is the…

Data Structures and Algorithms · Computer Science 2013-04-29 Tomohiro I , Yuto Nakashima , Shunsuke Inenaga , Hideo Bannai , Masayuki Takeda

We consider the problem of evaluating regular spanners over compressed documents, i.e., we wish to solve evaluation tasks directly on the compressed data, without decompression. As compressed forms of the documents we use straight-line…

Data Structures and Algorithms · Computer Science 2021-01-27 Markus L. Schmid , Nicole Schweikardt

In this paper, a fully compressed pattern matching problem is studied. The compression is represented by straight-line programs (SLPs), i.e. a context-free grammars generating exactly one string; the term fully means that both the pattern…

Data Structures and Algorithms · Computer Science 2013-06-26 Artur Jeż

Random access to highly compressed strings -- represented by straight-line programs or Lempel-Ziv parses, for example -- is a well-studied topic. Random access to such strings in strongly sublogarithmic time is impossible in the worst case,…

Data Structures and Algorithms · Computer Science 2026-02-05 Ferdinando Cicalese , Zsuzsanna Lipták , Travis Gagie , Gonzalo Navarro , Nicola Prezza , Cristian Urbina

Computation on compressed strings is one of the key approaches to processing massive data sets. We consider local subsequence recognition problems on strings compressed by straight-line programs (SLP), which is closely related to…

Data Structures and Algorithms · Computer Science 2011-11-10 Alexander Tiskin

Efficient methods for storing and querying are critical for scaling high-order n-gram language models to large corpora. We propose a language model based on compressed suffix trees, a representation that is highly compact and can be easily…

Computation and Language · Computer Science 2016-08-17 Ehsan Shareghi , Matthias Petri , Gholamreza Haffari , Trevor Cohn

LRM-Trees are an elegant way to partition a sequence of values into sorted consecutive blocks, and to express the relative position of the first element of each block within a previous block. They were used to encode ordinal trees and to…

Data Structures and Algorithms · Computer Science 2010-09-30 Jérémy Barbay , Johannes Fischer

We present a compressed representation of tries based on top tree compression [ICALP 2013] that works on a standard, comparison-based, pointer machine model of computation and supports efficient prefix search queries. Namely, we show how to…

Data Structures and Algorithms · Computer Science 2019-09-23 Philip Bille , Inge Li Gørtz , Paweł Gawrychowski , Gad M. Landau , Oren Weimann

This paper presents a tree-to-tree transduction method for sentence compression. Our model is based on synchronous tree substitution grammar, a formalism that allows local distortion of the tree topology and can thus naturally capture…

Computation and Language · Computer Science 2014-01-23 Trevor Anthony Cohn , Mirella Lapata

Offering rich contexts to Large Language Models (LLMs) has shown to boost the performance in various tasks, but the resulting longer prompt would increase the computational cost and might exceed the input limit of LLMs. Recently, some…

Computation and Language · Computer Science 2025-09-30 Wenhao Mao , Chengbin Hou , Tianyu Zhang , Xinyu Lin , Ke Tang , Hairong Lv

We present an efficient algorithm for calculating $q$-gram frequencies on strings represented in compressed form, namely, as a straight line program (SLP). Given an SLP $\mathcal{T}$ of size $n$ that represents string $T$, the algorithm…

Data Structures and Algorithms · Computer Science 2013-05-27 Keisuke Goto , Hideo Bannai , Shunsuke Inenaga , Masayuki Takeda

We introduce a new class of straight-line programs (SLPs), named the Lyndon SLP, inspired by the Lyndon trees (Barcelo, 1990). Based on this SLP, we propose a self-index data structure of $O(g)$ words of space that can be built from a…

Data Structures and Algorithms · Computer Science 2020-04-28 Kazuya Tsuruta , Dominik Köppl , Yuto Nakashima , Shunsuke Inenaga , Hideo Bannai , Masayuki Takeda
‹ Prev 1 2 3 10 Next ›