Related papers: Data
In this chapter, concepts related to information and computation are reviewed in the context of human computation. A brief introduction to information theory and different types of computation is given. Two examples of human computation…
A feature concept, the essence of the data-federative innovation process, is presented as a model of the concept to be acquired from data. A feature concept may be a simple feature, such as a single variable, but is more likely to be a…
We need a much better understanding of information processing and computation as its primary form. Future progress of new computational devices capable of dealing with problems of big data, internet of things, semantic web, cognitive robotics…
The plethora of existing data models and specific data modeling techniques is not only confusing but leads to complex, eclectic and inefficient designs of systems for data management and analytics. The main goal of this paper is to describe…
We describe a new logical data model, called the concept-oriented model (COM). It uses mathematical functions as first-class constructs for data representation and data processing as opposed to using exclusively sets in conventional…
Data comes in many forms. From a shallow perspective, it can be viewed as being either in structured (e.g., as a relation, as key-value pairs) or unstructured (e.g., text, image) formats. So far, machines have been fairly good at…
The concept of $typed$ $topology$ is introduced. In a typed topological space, some open sets are assigned "types", and topological concepts such as closure and connectedness can be defined using types. A finite data set in $R^2$ is a…
"Information Processing" is a recently launched buzzword whose meaning is vague and obscure even for the majority of its users. The reason for this is the lack of a suitable definition for the term "information". In my attempt to amend this…
This work continues the development of an intensional approach to computability initiated in previous work, in which programs and computations, rather than functions, constitute the primary objects of study. In this setting, models of…
We argue that while this discourse on data ethics is of critical importance, it is missing one fundamental point: If more and more efforts in business, government, science, and our daily lives are data-driven, we should pay more attention…
Data values in a dataset can be missing or anomalous due to mishandling or human error. Analysing data with missing values can create bias and affect the inferences. Several analysis methods, such as principal component analysis or…
This paper presents the different aspects and characteristics of strategic and operational information and proposes a categorization pattern for classifying a piece of information as strategic or operational. This categorization is to be…
Data Science is a complex and evolving field, but most agree that it can be defined as a combination of expertise drawn from three broad areas -- computer science and technology, math and statistics, and domain knowledge -- with the purpose of…
The specifics of data layout can be important for the efficiency of functional programs and interaction with external libraries. In this paper, we develop a type-theoretic approach to data layout that could be used as a typed intermediate…
This article presents the top level of an ontology categorizing and generalizing best practices and quality criteria or measures for Linked Data. It makes it possible to compare these techniques and to have a synthetic, organized view of what can or…
The digital transformation of our society is a constant challenge, as data is generated in almost every digital interaction. To use data effectively, it must be of high quality. This raises the question: what exactly is data quality? A…
Category theory offers a mathematical foundation for knowledge representation and database systems. Popular existing approaches model a database instance as a functor into the category of sets and functions, or as a 2-functor into the…
The information in an individual finite object (like a binary string) is commonly measured by its Kolmogorov complexity. One can divide that information into two parts: the information accounting for the useful regularity present in the…
Type inference refers to the task of inferring the data type of a given column of data. Current approaches often fail when data contains missing data and anomalies, which are found commonly in real-world data sets. In this paper, we propose…
Graphs are a generalized concept that encompasses more complex data structures than trees, such as difference lists, doubly-linked lists, skip lists, and leaf-linked trees. Normally, these structures are handled with destructive assignments…