Partout: A Distributed Engine for Efficient RDF Processing

Luis Galárraga; Katja Hose; Ralf Schenkel

Databases 2012-12-27 v1

Abstract

The increasing interest in Semantic Web technologies has led not only to rapid growth of semantic data on the Web but also to a growing number of backend applications, some of which already manage more than a trillion triples. Confronted with such huge amounts of data and their expected future growth, existing state-of-the-art systems for storing RDF and processing SPARQL queries are no longer sufficient. In this paper, we introduce Partout, a distributed engine for efficient RDF processing in a cluster of machines. We propose an effective approach for fragmenting RDF data sets based on a query log, allocating the fragments to nodes in a cluster, and finding the optimal configuration. Partout can efficiently handle updates, and its query optimizer produces efficient query execution plans for ad-hoc SPARQL queries. Our experiments show the superiority of our approach to state-of-the-art approaches for partitioning and distributed SPARQL query processing.

Keywords

distributed computing, data processing, concurrent algorithm

Cite

@article{arxiv.1212.5636,
  title  = {Partout: A Distributed Engine for Efficient RDF Processing},
  author = {Luis Galárraga and Katja Hose and Ralf Schenkel},
  journal= {arXiv preprint arXiv:1212.5636},
  year   = {2012}
}
