Related papers: Open Data Quality
Nowadays open data is entering the mainstream - it is free available for every stakeholder and is often used in business decision-making. It is important to be sure data is trustable and error-free as its quality problems can lead to huge…
The digital transformation of our society is a constant challenge, as data is generated in almost every digital interaction. To use data effectively, it must be of high quality. This raises the question: what exactly is data quality? A…
Data quality describes the degree to which data meet specific requirements and are fit for use by humans and/or downstream tasks (e.g., artificial intelligence). Data quality can be assessed across multiple high-level concepts called…
One of the most significant problems of Big Data is to extract knowledge through the huge amount of data. The usefulness of the extracted information depends strongly on data quality. In addition to the importance, data quality has recently…
Data-oriented applications, their users, and even the law require data of high quality. Research has divided the rather vague notion of data quality into various dimensions, such as accuracy, consistency, and reputation. To achieve the goal…
Data catalogs play a crucial role in modern data-driven organizations by facilitating the discovery, understanding, and utilization of diverse data assets. However, ensuring their quality and reliability is complex, especially in open and…
Data warehousing is continuously gaining importance as organizations are realizing the benefits of decision oriented data bases. However, the stumbling block to this rapid development is data quality issues at various stages of data…
In the distributed and dynamic framework of the Web, data quality is a big challenge. The Linked Open Data (LOD) provides an enormous amount of data, the quality of which is difficult to control. Quality is intrinsically a matter of usage,…
High-quality data is key to interpretable and trustworthy data analytics and the basis for meaningful data-driven decisions. In practical scenarios, data quality is typically associated with data preprocessing, profiling, and cleansing for…
Data quality is a key element for building and optimizing good learning models. Despite many attempts to characterize data quality, there is still a need for rigorous formalization and an efficient measure of the quality from available…
This paper presents a framework for assessing data and metadata quality within Open Data portals. Although a few benchmark frameworks already exist for this purpose, they are not yet detailed enough in both breadth and depth to make valid…
Data is of high quality if it is fit for its intended use. The quality of data is influenced by the underlying data model and its quality. One major quality problem is the heterogeneity of data as quality aspects such as understandability…
Software quality assurance has been a heated topic for several decades, but relatively few analyses were performed on open source software (OSS). As OSS has become very popular in our daily life, many researchers have been keen on the…
In order to introduce an integrated research information system, this will provide scientific institutions with the necessary information on research activities and research results in assured quality. Since data collection, duplication,…
With the rapid increase of published open datasets, it is crucial to support the open data progress in smart cities while considering the open data quality. In the Czech Republic, and its National Open Data Catalogue (NODC), the open…
Data completeness is an essential aspect of data quality, and has in turn a huge impact on the effective management of companies. For example, statistics are computed and audits are conducted in companies by implicitly placing the strong…
This report discusses the issues of data quality in biobanks. It presents the state-of-the-art in data quality: the definition of data quality, the dimensions of data quality, and the quality management system for achieving or describing…
A fundamental problem in the practice and teaching of data science is how to evaluate the quality of a given data analysis, which is different than the evaluation of the science or question underlying the data analysis. Previously, we…
Open data is an emerging paradigm to share large and diverse datasets -- primarily from governmental agencies, but also from other organizations -- with the goal to enable the exploitation of the data for societal, academic, and commercial…
Data is one of the most important assets of the information age, and its societal impact is undisputed. Yet, rigorous methods of assessing the quality of data are lacking. In this paper, we propose a formal definition for the quality of a…