Core spanners are a class of document spanners that capture the core functionality of IBM's AQL. FC is a logic on strings built around word equations that, when extended with constraints for regular languages, can be seen as a logic for core spanners. The recently introduced FC-Datalog extends FC with recursion, which allows us to define recursive relations for core spanners. Additionally, as FC-Datalog captures the complexity class P, it is also a tractable version of Datalog on strings. This presents an opportunity for optimization. We propose a series of FC-Datalog fragments with desirable properties in terms of complexity of model checking, expressive power, and efficiency of checking membership in the fragment. This leads to a range of fragments that all capture LOGSPACE, which we further restrict to obtain linear combined complexity. The result is a framework for tailoring fragments to particular applications. To showcase this, we simulate deterministic regex in a tailored fragment of FC-Datalog.
@article{arxiv.2501.10344,
title = {FC-Datalog as a Framework for Efficient String Querying},
author = {Owen M. Bell and Joel D. Day and Dominik D. Freydenberger},
journal = {arXiv preprint arXiv:2501.10344},
year = {2025}
}