Related papers: The jsonlite Package: A Practical and Consistent M…
JSON (JavaScript Object Notation) is a data encoding that allows structured data to be used in a standardized and straightforward manner across systems. Schemas for JSON-formatted data can be constructed using the JSON Schema standard,…
Despite the fact that JSON is currently one of the most popular formats for exchanging data on the Web, there are very few studies on this topic and there is no agreed-upon theoretical framework for dealing with JSON. Therefore in…
Regularized regression models are well studied and, under appropriate conditions, offer fast and statistically interpretable results. However, large data in many applications are heterogeneous in the sense of harboring distributional…
Semantic types are a more powerful and detailed way of describing data than atomic types such as strings or integers. They establish connections between columns and concepts from the real world, providing more nuanced and fine-grained…
JSON is an essential file and data format in domains that span scientific computing, web APIs or configuration management. Its popularity has motivated significant software development effort to build multiple libraries to process JSON…
We propose a logical framework, based on Datalog, to study the foundations of querying JSON data. The main feature of our approach, which we call J-Logic, is the emphasis on paths. Paths are sequences of keys and are used to access the tree…
Many applications model their data in a general-purpose storage format such as JSON. This data structure is modified by the application as a result of user input. Such modifications are well understood if performed sequentially on a single…
While there exist approaches to integrate heterogeneous data using semantic models, such semantic models can typically not be used by existing software tools. Many software tools - especially in engineering - only have options to import and…
JSON is a popular data format used pervasively in web APIs, cloud computing, NoSQL databases, and increasingly also machine learning. JSON Schema is a language for declaring the structure of valid JSON data. There are validators that can…
Recent workshops brought together several developers, educators and users of software packages extending popular languages for spatial data handling, with a primary focus on R, Python and Julia. Common challenges discussed included handling…
JSON Schema is the de facto standard for describing the structure of JSON documents. Reasoning about JSON Schema inclusion -- whether every instance satisfying a schema S1 also satisfies a schema S2 -- is a key building block for a variety…
Semi-structured data formats such as JSON have proved to be useful data models for applications that require flexibility in the format of data stored. However, JSON data often come without the schemas that are typically available with…
JSON is a popular standard for data interchange on the Internet. Ingesting JSON documents can be a performance bottleneck. A popular parsing strategy consists in converting the input text into a tree-based data structure -- sometimes called…
The rapid evolution in the fields of computer science, data science, and artificial intelligence has significantly transformed the utilisation of data for decision-making. Data visualisation plays a critical role in any work that involves…
Conflict-Free Replicated Data Types (CRDTs) for JSON allow users to concurrently update a JSON document and automatically merge the updates into a consistent state. Moving a subtree in a map or reordering elements in a list within a JSON…
Despite progress in the development of standards for describing and exchanging scientific information, the lack of easy-to-use standards for mapping between different representations of the same or similar objects in different databases…
Schema discovery is an important aspect of working with data in formats such as JSON. Unlike relational databases, JSON data sets often do not have associated structural information. Consumers of such datasets are often left to browse…
Everything that exists in R is an object [Chambers2016]. This article examines what would be possible if we kept copies of all R objects that have ever been created. Not only objects but also their properties, meta-data, relations with…
Functional and inclusion dependencies are the most widely used classes of data dependencies in data profiling due to their ability to identify relationships in data such as primary and foreign keys. These relationships are equally important…
The package hset for the R language contains an implementation of an S4 class for sets and multisets of numbers. The implementation, based on the hash table data structure from the package hash (Brown, 2019), allows for quick operations when…