Related papers: Data Validation Infrastructure for R
Data validation is the activity where one decides whether or not a particular data set is fit for a given purpose. Formalizing the requirements that drive this decision process allows for unambiguous communication of the requirements,…
A software package has been developed to bridge the R analysis model with the conceptual analysis environment typical of radiation physics experiments. The new package has been used in the context of a project for the validation of…
The validation of a data-driven model is the process of assessing the model's ability to generalize to new, unseen data in the population of interest. This paper proposes a set of general rules for model validation. These rules are designed…
Industrial and scientific applications handle large volumes of data that render manual validation by humans infeasible. Therefore, we require automated data validation approaches that are able to consider the prior knowledge of domain…
Validation is often defined as the process of determining the degree to which a model is an accurate representation of the real world from the perspective of its intended uses. Validation is crucial as industries and governments depend…
We motivate and offer a formal definition of validation as it applies to information fusion systems. Common definitions of validation compare the actual state of the world with that derived by the fusion process. This definition conflates…
Traditionally, practitioners use formal methods predominantly for one half of the quality-assurance process: verification (do we build the software right?). The other half -- validation (do we build the right software?) -- has been given…
Purpose: The introduction of artificial intelligence / machine learning (AI/ML) products to the regulated fields of pharmaceutical research and development (R&D) and drug manufacture, and medical devices (MD) and in-vitro diagnostics (IVD),…
Data quality describes the degree to which data meet specific requirements and are fit for use by humans and/or downstream tasks (e.g., artificial intelligence). Data quality can be assessed across multiple high-level concepts called…
Validation is one of the software engineering disciplines that help build quality into software. The major objective of the software validation process is to determine that the software performs its intended functions correctly and provide…
Data completeness is an essential aspect of data quality, and has in turn a huge impact on the effective management of companies. For example, statistics are computed and audits are conducted in companies by implicitly placing the strong…
Validation accuracy is a necessary, but not sufficient, measure of a neural network classifier's quality. High validation accuracy during development does not guarantee that a model is free of serious flaws, such as vulnerability to…
In the distributed and dynamic framework of the Web, data quality is a big challenge. The Linked Open Data (LOD) provides an enormous amount of data, the quality of which is difficult to control. Quality is intrinsically a matter of usage,…
Data-oriented applications, their users, and even the law require data of high quality. Research has divided the rather vague notion of data quality into various dimensions, such as accuracy, consistency, and reputation. To achieve the goal…
The objective of this research is the development of a practical system to manipulate and validate software package specifications. The validation process developed is based on consistency checks. Furthermore, by means of scenarios, the…
Dynamically typed programming languages like R allow programmers to write generic, flexible and concise code and to interact with the language using an interactive read-eval-print loop (REPL). However, this flexibility has its price: As the…
There has been a massive explosion of data generated by customers and retained by companies in the last decade. However, there is a significant mismatch between the increasing volume of data and the lack of automation methods and tools. The…
Assessing and improving the quality of data are fundamental challenges for data-intensive systems that have given rise to applications targeting transformation and cleaning of data. However, while schema design, data cleaning, and data…
Internal software quality determines the maintainability of the software product and influences the quality in use. There is a plethora of metrics which purport to measure the internal quality of software, and these metrics are offered by…