FC-Datalog as a Framework for Efficient String Querying

Logic in Computer Science | Databases | Formal Languages and Automata Theory | 2025-01-20 (v1)

Abstract

Core spanners are a class of document spanners that capture the core functionality of IBM's AQL. FC is a logic on strings built around word equations; when extended with constraints for regular languages, it can be seen as a logic for core spanners. The recently introduced FC-Datalog extends FC with recursion, which allows recursive relations to be defined for core spanners. Moreover, because FC-Datalog captures P, it is also a tractable version of Datalog on strings, which presents an opportunity for optimization. We propose a series of FC-Datalog fragments with desirable properties in terms of the complexity of model checking, expressive power, and the efficiency of checking membership in the fragment. This leads to a range of fragments that all capture LOGSPACE, which we further restrict to obtain linear combined complexity. The result is a framework for tailoring fragments to particular applications. To showcase this, we simulate deterministic regex in a tailored fragment of FC-Datalog.

Cite

@article{arxiv.2501.10344,
  title  = {FC-Datalog as a Framework for Efficient String Querying},
  author = {Owen M. Bell and Joel D. Day and Dominik D. Freydenberger},
  journal= {arXiv preprint arXiv:2501.10344},
  year   = {2025}
}