A Formal Category Theoretical Framework for Multi-model Data Transformations

Databases 2022-01-14 v1

Abstract

Data integration and migration processes in polystores and multi-model database management systems benefit greatly from data and schema transformations. Rigorous modeling of these transformations is a complex problem. The field of data and schema transformation is fragmented across many frameworks, tools, and mappings, which are usually domain-specific and lack solid theoretical foundations. Our first goal is to define category-theoretical foundations for relational, graph, and hierarchical data models and instances. Each data instance is represented as a category-theoretical mapping called a functor. We formalize data and schema transformations as Kan lifts, using the functorial representation of the instances. A Kan lift is a category-theoretical construction consisting of two mappings that satisfy a certain universal property. In this work, the two mappings correspond to the schema transformation and the data transformation.
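To make the phrase "a data instance is a functor" more concrete, the following minimal sketch shows one common reading of it, which is not necessarily the paper's exact construction. It assumes that a schema is a small category freely generated by a directed graph, and that an instance sends each schema object to a set of rows and each schema arrow to a total function between those sets. The schema, the table and column names, and the helper functions `is_valid_instance` and `follow_path` are all hypothetical and chosen only for illustration.

```python
# Hypothetical toy schema: objects are entity or value types,
# arrows are attributes or foreign keys, given as name -> (source, target).
schema_arrows = {
    "works_in": ("Employee", "Department"),
    "dept_name": ("Department", "String"),
}

# A candidate instance: each schema object is mapped to a set of elements...
instance_objects = {
    "Employee": {"e1", "e2"},
    "Department": {"d1"},
    "String": {"Sales"},
}

# ...and each schema arrow is mapped to a function, stored as a dict.
instance_arrows = {
    "works_in": {"e1": "d1", "e2": "d1"},
    "dept_name": {"d1": "Sales"},
}


def is_valid_instance(arrows, obj_map, arrow_map):
    """Check that every schema arrow is sent to a total function
    from the image of its source to the image of its target."""
    for name, (src, dst) in arrows.items():
        func = arrow_map[name]
        # The function must be defined on every element of the source set.
        if set(func.keys()) != obj_map[src]:
            return False
        # Every value must land in the target set.
        if not set(func.values()) <= obj_map[dst]:
            return False
    return True


def follow_path(arrow_map, path, element):
    """Apply the image of a composite arrow (a path of generating arrows)
    by composing the images of the generators in order."""
    for name in path:
        element = arrow_map[name][element]
    return element


valid = is_valid_instance(schema_arrows, instance_objects, instance_arrows)
department_of_e1 = follow_path(instance_arrows, ["works_in", "dept_name"], "e1")
```

Because the schema in this sketch is freely generated by its arrows, the image of a composite arrow is defined as the composite of the images, so preservation of composition holds by construction and only the per-arrow typing check is needed. A schema with path equations would need an additional check that equal paths are sent to equal functions.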

Cite

@article{arxiv.2201.04905,
  title  = {A Formal Category Theoretical Framework for Multi-model Data Transformations},
  author = {Valter Uotila and Jiaheng Lu},
  journal= {arXiv preprint arXiv:2201.04905},
  year   = {2022}
}

Comments

15 pages, 4 figures, Heterogeneous Data Management, Polystores, and Analytics for Healthcare, VLDB Workshops, Poly 2021 and DMAH 2021