Related papers: XPath Node Selection over Grammar-Compressed Trees

Fast and Tiny Structural Self-Indexes for XML

XML document markup is highly repetitive and therefore well compressible using dictionary-based methods such as DAGs or grammars. In the context of selectivity estimation, grammar-compressed trees were used before as synopsis for structural…

Databases · Computer Science 2010-12-30 Sebastian Maneth, Tom Sebastian

Fast In-Memory XPath Search over Compressed Text and Tree Indexes

A large fraction of an XML document typically consists of text data. The XPath query language allows text search via the equal, contains, and starts-with predicates. Such predicates can efficiently be implemented using a compressed…

Databases · Computer Science 2011-10-06 A. Arroyuelo, F. Claude, S. Maneth, V. Mäkinen, G. Navarro, K. Nguyen, J. Siren, N. Välimäki

XPath Whole Query Optimization

Previous work reports about SXSI, a fast XPath engine which executes tree automata over compressed XML indexes. Here, reasons are investigated why SXSI is so fast. It is shown that tree automata can be used as a general framework for fine…

Databases · Computer Science 2015-03-13 Sebastian Maneth, Kim Nguyen

Traversing Grammar-Compressed Trees with Constant Delay

A grammar-compressed ranked tree is represented with a linear space overhead so that a single traversal step, i.e., the move to the parent or the i-th child, can be carried out in constant time. Moreover, we extend our data structure such…

Data Structures and Algorithms · Computer Science 2015-11-11 Markus Lohrey, Sebastian Maneth, Carl Philipp Reh

On the Count of Trees

Regular tree grammars and regular path expressions constitute core constructs widely used in programming languages and type systems. Nevertheless, there has been little research so far on frameworks for reasoning about path expressions…

Databases · Computer Science 2010-08-31 Everardo Barcenas, Pierre Geneves, Nabil Layaida, Alan Schmitt

Fixpoint Node Selection Query Languages for Trees

The study of node selection query languages for (finite) trees has been a major topic in the recent research on query languages for Web documents. On one hand, there has been an extensive study of XPath and its various extensions. On the…

Databases · Computer Science 2018-11-15 Diego Calvanese, Giuseppe De Giacomo, Maurizio Lenzerini, Moshe Y. Vardi

Grammar-Based Graph Compression

We present a new graph compressor that works by recursively detecting repeated substructures and representing them through grammar rules. We show that for a large number of graphs the compressor obtains smaller representations than other…

Data Structures and Algorithms · Computer Science 2017-04-19 Sebastian Maneth, Fabian Peternek

On the complexity of XPath containment in the presence of disjunction, DTDs, and variables

XPath is a simple language for navigating an XML-tree and returning a set of answer nodes. The focus in this paper is on the complexity of the containment problem for various fragments of XPath. We restrict attention to the most common…

Databases · Computer Science 2017-01-11 Frank Neven, Thomas Schwentick

XTreePath: A generalization of XPath to handle real world structural variation

We discuss a key problem in information extraction which deals with wrapper failures due to changing content templates. A good proportion of wrapper failures are due to HTML templates changing to cause wrappers to become incompatible after…

Information Retrieval · Computer Science 2017-12-29 Joseph Paul Cohen, Wei Ding, Abraham Bagherjeiran

Global Numerical Constraints on Trees

We introduce a logical foundation to reason on tree structures with constraints on the number of node occurrences. Related formalisms are limited to express occurrence constraints on particular tree regions, as for instance the children of…

Logic in Computer Science · Computer Science 2015-07-01 Everardo Bárcenas, Jesús Lavalle

Tree compression using string grammars

We study the compressed representation of a ranked tree by a (string) straight-line program (SLP) for its preorder traversal, and compare it with the well-studied representation by straight-line context free tree grammars (which are also…

Formal Languages and Automata Theory · Computer Science 2015-09-29 Moses Ganardi, Danny Hucke, Markus Lohrey, Eric Noeth

Fast, Small and Exact: Infinite-order Language Modelling with Compressed Suffix Trees

Efficient methods for storing and querying are critical for scaling high-order n-gram language models to large corpora. We propose a language model based on compressed suffix trees, a representation that is highly compact and can be easily…

Computation and Language · Computer Science 2016-08-17 Ehsan Shareghi, Matthias Petri, Gholamreza Haffari, Trevor Cohn

Optimizing XML querying using type-based document projection

XML data projection (or pruning) is a natural optimization for main memory query engines: given a query Q over a document D, the subtrees of D that are not necessary to evaluate Q are pruned, thus producing a smaller document D'; the query…

Databases · Computer Science 2015-03-19 Véronique Benzaken, Giuseppe Castagna, Dario Colazzo, Kim Nguyen

Optimizing XML Compression

The eXtensible Markup Language (XML) provides a powerful and flexible means of encoding and exchanging data. As it turns out, its main advantage as an encoding format (namely, its requirement that all open and close markup tags are present…

Databases · Computer Science 2015-05-13 Gregory Leighton, Denilson Barbosa

Logics for XML

This thesis describes the theoretical and practical foundations of a system for the static analysis of XML processing languages. The system relies on a fixpoint temporal logic with converse, derived from the mu-calculus, where models are…

Programming Languages · Computer Science 2014-05-27 Pierre Geneves

Efficient XML Keyword Search based on DAG-Compression

In contrast to XML query languages as e.g. XPath which require knowledge on the query language as well as on the document structure, keyword search is open to anybody. As the size of XML sources grows rapidly, the need for efficient search…

Databases · Computer Science 2013-11-27 Stefan Böttcher, Rita Hartel, Jonathan Rabe

Finding Good Itemsets by Packing Data

The problem of selecting small groups of itemsets that represent the data well has recently gained a lot of attention. We approach the problem by searching for the itemsets that compress the data efficiently. As a compression technique we…

Data Structures and Algorithms · Computer Science 2019-02-08 Nikolaj Tatti, Jilles Vreeken

Answering Queries using Views over Probabilistic XML: Complexity and Tractability

We study the complexity of query answering using views in a probabilistic XML setting, identifying large classes of XPath queries -- with child and descendant navigation and predicates -- for which there are efficient (PTime) algorithms. We…

Databases · Computer Science 2012-08-02 Bogdan Cautis, Evgeny Kharlamov

Learning XML Twig Queries

We investigate the problem of learning XML queries, path queries and tree pattern queries, from examples given by the user. A learning algorithm takes on the input a set of XML documents with nodes annotated by the user and returns a query…

Databases · Computer Science 2012-04-24 Sławomir Staworko, Piotr Wieczorek

Top Tree Compression of Tries

We present a compressed representation of tries based on top tree compression [ICALP 2013] that works on a standard, comparison-based, pointer machine model of computation and supports efficient prefix search queries. Namely, we show how to…

Data Structures and Algorithms · Computer Science 2019-09-23 Philip Bille, Inge Li Gørtz, Paweł Gawrychowski, Gad M. Landau, Oren Weimann