Related papers: Validating Streaming JSON Documents with Learned V…

200 papers

Some of the most relevant document schemas used online, such as XML and JSON, have a nested format. In the last decade, the task of extracting data from nested documents over streams has become especially relevant. We focus on the streaming…

Databases · Computer Science 2022-01-11 Martín Muñoz, Cristian Riveros

JSON is a popular data format used pervasively in web APIs, cloud computing, NoSQL databases, and increasingly also machine learning. JSON Schema is a language for declaring the structure of valid JSON data. There are validators that can…

Programming Languages · Computer Science 2020-05-19 Andrew Habib, Avraham Shinnar, Martin Hirzel, Michael Pradel

JSON Schema is an important, evolving standard schema language for families of JSON documents. It is based on a complex combination of structural and Boolean assertions, and features negation and recursion. The static analysis of JSON…

Visibly pushdown automata (VPA), introduced by Alur and Madhusudan in 2004, is a subclass of pushdown automata whose stack behavior is completely determined by the input symbol according to a fixed partition of the input alphabet. Since its…

Formal Languages and Automata Theory · Computer Science 2009-11-18 Nguyen Van Tang

JSON Schema is the de facto standard for describing the structure of JSON documents. Reasoning about JSON Schema inclusion -- whether every instance satisfying a schema S1 also satisfies a schema S2 -- is a key building block for a variety…

JSON Schemas provide useful guardrails for developers of Web APIs to guarantee that the semi-structured JSON input provided by clients matches a predefined structure. This is important both to ensure the correctness of the data received as…

Databases · Computer Science 2025-03-12 Juan Cruz Viotti, Michael J. Mior

LLM agents routinely serve as first (and sometimes only) readers of academic papers, skimming for sub-claims, extracting reproducibility steps, and generalizing scope. Standard prose papers produce recurring failures in this role:…

Digital Libraries · Computer Science 2026-05-18 Arquimedes Canedo

JSON Schema is the de-facto standard schema language for JSON data. The language went through many minor revisions, but the most recent versions of the language added two novel features, dynamic references and annotation-dependent…

JSON is a popular standard for data interchange on the Internet. Ingesting JSON documents can be a performance bottleneck. A popular parsing strategy consists in converting the input text into a tree-based data structure -- sometimes called…

Databases · Computer Science 2024-08-02 John Keiser, Daniel Lemire

JSON (JavaScript Object Notation) is a data encoding that allows structured data to be used in a standardized and straightforward manner across systems. Schemas for JSON-formatted data can be constructed using the JSON Schema standard,…

Programming Languages · Computer Science 2025-08-13 Jack Stanek, Daniel Killough

In the context of language recognition, we demonstrate the superiority of streaming property testers against streaming algorithms and property testers, when they are not combined. Initiated by Feigenbaum et al., a streaming property tester…

Data Structures and Algorithms · Computer Science 2015-11-04 Nathanaël François, Frédéric Magniez, Michel de Rougemont, Olivier Serre

An underlying assumption in conventional multi-view learning algorithms is that all views can be simultaneously accessed. However, due to various factors when collecting and pre-processing data from different views, the streaming view…

Machine Learning · Statistics 2016-04-29 Chang Xu, Dacheng Tao, Chao Xu

False-positives are a problem in anomaly-based intrusion detection systems. To counter this issue, we discuss anomaly detection for the eXtensible Markup Language (XML) in a language-theoretic view. We argue that many XML-based attacks…

Cryptography and Security · Computer Science 2013-11-13 Harald Lampesberger

We study the problem of validating XML documents of size $N$ against general DTDs in the context of streaming algorithms. The starting point of this work is a well-known space lower bound. There are XML documents and DTDs for which $p$-pass…

Data Structures and Algorithms · Computer Science 2011-08-19 Christian Konrad, Frederic Magniez

We study which property testing and sublinear time algorithms can be transformed into graph streaming algorithms for random order streams. Our main result is that for bounded degree graphs, any property that is constant-query testable in…

Data Structures and Algorithms · Computer Science 2017-07-25 Morteza Monemizadeh, S. Muthukrishnan, Pan Peng, Christian Sohler

In real-world contexts, sometimes data are available in form of Natural Data Streams, i.e. data characterized by a streaming nature, unbalanced distribution, data drift over a long time frame and strong correlation of samples in short time…

Computer Vision and Pattern Recognition · Computer Science 2023-01-10 Guido Borghi, Gabriele Graffieti, Davide Maltoni

JSON Schema is an evolving standard for the description of families of JSON documents. JSON Schema is a logical language, based on a set of assertions that describe features of the JSON value under analysis and on logical or structural…

Databases · Computer Science 2021-05-10 Mohamed-Amine Baazizi, Dario Colazzo, Giorgio Ghelli, Carlo Sartiani, Stefanie Scherzinger

We introduce streaming data string transducers that map input data strings to output data strings in a single left-to-right pass in linear time. Data strings are (unbounded) sequences of data values, tagged with symbols from a finite set,…

Programming Languages · Computer Science 2011-02-15 Rajeev Alur, Pavol Cerny

Schema discovery is an important aspect to working with data in formats such as JSON. Unlike relational databases, JSON data sets often do not have associated structural information. Consumers of such datasets are often left to browse…

Databases · Computer Science 2023-07-07 Michael J. Mior

With the ubiquity of computer vision in industry, the importance of image provenance is becoming more apparent. Provenance provides information about the origin and derivation of some resource, e.g., an image dataset, enabling users to…

Machine Learning · Computer Science 2026-03-31 Lynn Vonderhaar, Timothy Elvira, Tyler Thomas Procko, Omar Ochoa