Related papers: Open Data: Reverse Engineering and Maintenance Per…
High-quality data has become increasingly important to software engineers in designing and implementing today's software, for example, as an input to machine-learning algorithms and visualisation- and analytics-based features. Open data -…
Reverse engineering has been a standard practice in the hardware community for some time. It has only been within the last ten years that reverse engineering, or "program comprehension", has grown into the current sub-discipline of software…
Data engineering is one of the fastest-growing fields within machine learning (ML). As ML becomes more common, the appetite for data grows more ravenous. But ML requires more data than individual teams of data engineers can readily produce,…
Data management, which encompasses activities and strategies related to the storage, organization, and description of data and other research materials, helps ensure the usability of datasets -- both for the original research team and for…
Open science describes the movement of making any research artefact available to the public and includes, but is not limited to, open access, open data, and open source. While open science is becoming generally accepted as a norm in other…
Information and communication technologies are permeating all aspects of industrial and manufacturing systems, expediting the generation of large volumes of industrial data. This article surveys the recent literature on data management as…
The Web community has introduced a set of standards and technologies for representing, querying, and manipulating a globally distributed data structure known as the Web of Data. The proponents of the Web of Data envision much of the world's…
Open Science aims to foster openness and collaboration in research, leading to more significant scientific and social impact. However, practicing Open Science comes with several challenges and is currently not properly rewarded. In this…
Modern spreadsheet systems can be used to implement complex spreadsheet applications including data sheets, customized user forms and executable procedures written in a scripting language. These applications are often developed by…
The research discusses how (open) data quality could be described, what should be considered when developing a data quality management solution, and how it could be applied to open data to check its quality. The proposed approach focuses on…
Data-driven science is heralded as a new paradigm in materials science. In this field, data is the new resource, and knowledge is extracted from materials data sets that are too big or complex for traditional human reasoning - typically…
Easy and mostly free access to the internet has resulted in the growing use of open source software (OSS). However, it is a common perception that closed proprietary software is still superior in areas such as software maintenance and…
The nature of software re-engineering is to improve or transform existing software so it can be understood, controlled and reused as new software. The need to re-engineer software has greatly increased. The system software…
The Open Source Software movement has been growing exponentially for a number of years with no signs of slowing. Driving this growth is the widespread availability of libraries and frameworks that provide many functionalities. Developers…
The quality of the data in a dataset can have a substantial impact on the performance of a machine learning model that is trained and/or evaluated using the dataset. Effective dataset management, including tasks such as data cleanup,…
This paper reflects on a number of trends towards a more open and reproducible approach to geographic and spatial data science over recent years. In particular it considers trends towards Big Data, and the impacts this is having on spatial…
High-quality datasets are typically required for accomplishing data-driven tasks, such as training medical diagnosis models, predicting real-time traffic conditions, or conducting experiments to validate research hypotheses. Consequently,…
"Open data for all New Yorkers" is the tagline on New York City's open data website. Open government is being promoted in most countries of the western world. Government transparency levels are being measured by the amount of data they share…
The web of data has brought forth the need to preserve and sustain evolving information within linked datasets; however, a basic requirement of data preservation is the maintenance of the datasets' structural characteristics as well. As…
In open-source software development environments, the textual, numerical and relationship-based data generated are of interest to researchers. Various data sets are available for this data, which is frequently used in areas such as software…