Apache Calcite: A Foundational Framework for Optimized Query Processing Over Heterogeneous Data Sources

Databases 2020-10-09 v1

Abstract

Apache Calcite is a foundational software framework that provides query processing, optimization, and query language support to many popular open-source data processing systems such as Apache Hive, Apache Storm, Apache Flink, Druid, and MapD. Calcite's architecture consists of a modular and extensible query optimizer with hundreds of built-in optimization rules, a query processor capable of processing a variety of query languages, an adapter architecture designed for extensibility, and support for heterogeneous data models and stores (relational, semi-structured, streaming, and geospatial). This flexible, embeddable, and extensible architecture makes Calcite an attractive choice for adoption in big-data frameworks. It is an active project that continues to add support for new types of data sources, query languages, and approaches to query processing and optimization.

Cite

@article{arxiv.1802.10233,
  title  = {Apache Calcite: A Foundational Framework for Optimized Query Processing Over Heterogeneous Data Sources},
  author = {Edmon Begoli and Jesús Camacho Rodríguez and Julian Hyde and Michael J. Mior and Daniel Lemire},
  journal= {arXiv preprint arXiv:1802.10233},
  year   = {2018}
}

Comments

SIGMOD'18