Related papers: Tree Compression with Top Trees
We revisit tree compression with top trees (Bille et al, ICALP'13) and present several improvements to the compressor and its analysis. By significantly reducing the amount of information stored and guiding the compression step using a…
We consider the top tree compression scheme introduced by Bille et al. [ICALP 2013] and construct an infinite family of trees on $n$ nodes labeled from an alphabet of size $\sigma$, for which the size of the top DAG is…
We present an algorithm for computing a maximum agreement subtree of two unrooted evolutionary trees. It takes O(n^{1.5} log n) time for trees with unbounded degrees, matching the best known time complexity for the rooted case. Our…
The class of self-nested trees presents remarkable compression properties because of the systematic repetition of subtrees in their structure. In this paper, we provide a better combinatorial characterization of this specific family of…
A DAG compression of a (typically dense) graph is a simple data structure that stores how vertex clusters are connected, where the clusters are described indirectly as sets of reachable sinks in a directed acyclic graph (DAG). They…
We present a compressed representation of tries based on top tree compression [ICALP 2013] that works on a standard, comparison-based, pointer machine model of computation and supports efficient prefix search queries. Namely, we show how to…
We address the problem of efficiently gathering correlated data from a wired or a wireless sensor network, with the aim of designing algorithms with provable optimality guarantees, and understanding how close we can get to the known…
In this work we introduce a new linear time compression algorithm, called "Re-pair for Trees", which compresses ranked ordered trees using linear straight-line context-free tree grammars. Such grammars generalize straight-line context-free…
The $k^2$-tree is a compact data structure designed to efficiently store sparse binary matrices by leveraging both sparsity and clustering of nonzero elements. This representation supports efficiently navigational operations and complex…
Suffix trees are a fundamental data structure in stringology, but their space usage, though linear, is an important problem for its applications. We design and implement a new compressed suffix tree targeted to highly repetitive texts, such…
It is shown that for a given ordered node-labelled tree of size $n$ and with $s$ many different node labels, one can construct in linear time a top dag of height $O(\log n)$ and size $O(n / \log_\sigma n) \cap O(d \cdot \log n)$, where…
This paper presents a tree-to-tree transduction method for sentence compression. Our model is based on synchronous tree substitution grammar, a formalism that allows local distortion of the tree topology and can thus naturally capture…
Frequent pattern mining is a relevant method to analyse structured data, like sequences, trees or graphs. It consists in identifying characteristic substructures of a dataset. This paper deals with a new type of patterns for tree data:…
A central challenge in scaling up explicit state-space search for large tasks is compactly representing the set of generated states. Tree databases, a data structure from model checking, require constant space per generated state in the…
Tree-based machine learning techniques, such as Decision Trees and Random Forests, are top performers in several domains as they do well with limited training datasets and offer improved interpretability compared to Deep Neural Networks…
Unranked trees can be represented using their minimal dag (directed acyclic graph). For XML this achieves high compression ratios due to their repetitive mark up. Unranked trees are often represented through first child/next sibling (fcns)…
In this report an entropy bound on the memory size is given for a compression method of leaf-labeled trees. The compression converts the tree into a Directed Acyclic Graph (DAG) by merging isomorphic subtrees.
This paper focuses on reducing memory usage in enumerative model checking, while maintaining the multi-core scalability obtained in earlier work. We present a tree-based multi-core compression method, which works by leveraging sharing among…
Dynamic regression trees are an attractive option for automatic regression and classification with complicated response surfaces in on-line application settings. We create a sequential tree model whose state changes in time with the…
Can we do arithmetic in a completely different way, with a radically different data structure? Could this approach provide practical benefits, like operations on giant numbers while having an average performance similar to traditional…