Validating Streaming JSON Documents with Learned VPAs
Abstract
We present a new streaming algorithm to validate JSON documents against a set of constraints given as a JSON schema. Among the possible values a JSON document can hold, objects are unordered collections of key-value pairs while arrays are ordered collections of values. We prove that there always exists a visibly pushdown automaton (VPA) that accepts the same set of JSON documents as a JSON schema. Leveraging this result, our approach relies on learning a VPA for the provided schema. As the learned VPA assumes a fixed order on the key-value pairs of the objects, we abstract its transitions in a special kind of graph, and propose an efficient streaming algorithm using the VPA and its graph to decide whether a JSON document is valid for the schema. We evaluate the implementation of our algorithm on a number of random JSON documents, and compare it to the classical validation algorithm.
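To illustrate why a visibly pushdown automaton suits streaming validation, the sketch below runs a small hand-written VPA over a token stream. It is not the authors' learned automaton and it does not cover the key-order graph for objects; the class name `ToyVPA`, the state names, and the toy language (a value is either the string token `s` or an array of values) are assumptions made only for this example. The point it shows is that the input symbol alone decides whether the stack is pushed (`[`), popped (`]`), or left untouched, so tokens can be consumed one at a time with memory proportional to the nesting depth.

```python
from typing import Dict, Iterable, List, Set, Tuple

CALL_SYMBOLS = {"["}      # symbols that always push onto the stack
RETURN_SYMBOLS = {"]"}    # symbols that always pop from the stack


class ToyVPA:
    """A deterministic VPA given by three partial transition tables (hypothetical helper)."""

    def __init__(
        self,
        initial: str,
        accepting: Set[str],
        call: Dict[Tuple[str, str], Tuple[str, str]],   # (state, symbol) -> (state, pushed symbol)
        ret: Dict[Tuple[str, str, str], str],           # (state, symbol, popped symbol) -> state
        internal: Dict[Tuple[str, str], str],           # (state, symbol) -> state
    ) -> None:
        self.initial = initial
        self.accepting = accepting
        self.call = call
        self.ret = ret
        self.internal = internal

    def accepts(self, tokens: Iterable[str]) -> bool:
        """Consume tokens one by one; reject as soon as no transition applies."""
        state = self.initial
        stack: List[str] = []
        for tok in tokens:
            if tok in CALL_SYMBOLS:
                step = self.call.get((state, tok))
                if step is None:
                    return False
                state, pushed = step
                stack.append(pushed)
            elif tok in RETURN_SYMBOLS:
                if not stack:
                    return False
                nxt = self.ret.get((state, tok, stack.pop()))
                if nxt is None:
                    return False
                state = nxt
            else:
                nxt = self.internal.get((state, tok))
                if nxt is None:
                    return False
                state = nxt
        # Accept only in an accepting state with every array closed.
        return state in self.accepting and not stack


# Toy language: value ::= "s" | "[" "]" | "[" value ("," value)* "]"
nested_strings = ToyVPA(
    initial="start",
    accepting={"done"},
    call={
        ("start", "["): ("open", "TOP"),  # outermost array
        ("open", "["): ("open", "IN"),    # nested array as first element
        ("next", "["): ("open", "IN"),    # nested array after a comma
    },
    ret={
        ("open", "]", "TOP"): "done",     # empty outermost array
        ("after", "]", "TOP"): "done",    # non-empty outermost array
        ("open", "]", "IN"): "after",     # empty nested array
        ("after", "]", "IN"): "after",    # non-empty nested array
    },
    internal={
        ("start", "s"): "done",
        ("open", "s"): "after",
        ("next", "s"): "after",
        ("after", ","): "next",
    },
)

# The tokens are passed as an iterator to mimic a stream.
print(nested_strings.accepts(iter(["[", "s", ",", "[", "]", "]"])))  # True
```

The stack symbols `TOP` and `IN` record whether a closing bracket ends the whole document or only a nested array, which is the information a finite automaton without a stack could not track for unbounded nesting.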
Cite
@article{arxiv.2211.08891,
title = {Validating Streaming JSON Documents with Learned VPAs},
author = {Véronique Bruyère and Guillermo A. Perez and Gaëtan Staquet},
journal = {arXiv preprint arXiv:2211.08891},
year = {2023}
}
Comments
46 pages, 10 figures, published at TACAS 2023