Designing Traceability into Big Data Systems

Databases 2015-02-06 v1

Abstract

Providing an appropriate level of accessibility and traceability to data or process elements (so-called Items) in large volumes of data, often Cloud-resident, is an essential requirement in the Big Data era. Enterprise-wide data systems need to be designed from the outset to support usage of such Items across the spectrum of business use rather than from any specific application view. The design philosophy advocated in this paper is to drive the design process using a so-called description-driven approach which enriches models with meta-data and description and focuses the design process on Item re-use, thereby promoting traceability. Details are given of the description-driven design of big data systems at CERN, in health informatics and in business process management. Evidence is presented that the approach leads to design simplicity and consequent ease of management thanks to loose typing and the adoption of a unified approach to Item management and usage.

Cite

@article{arxiv.1502.01545,
  title  = {Designing Traceability into Big Data Systems},
  author = {Richard McClatchey and Andrew Branson and Jetendr Shamdasani and Zsolt Kovacs and the CRISTAL-ISE Consortium},
  journal= {arXiv preprint arXiv:1502.01545},
  year   = {2015}
}

Comments

10 pages, 6 figures. In Proceedings of the 5th Annual International Conference on ICT: Big Data, Cloud and Security (ICT-BDCS 2015), Singapore, July 2015. arXiv admin note: text overlap with arXiv:1402.5764, arXiv:1402.5753
