Related papers: The HSF Conditions Database Reference Implementati…

Experience with the Open Source based implementation for ATLAS Conditions Data Management System

Conditions Data in high energy physics experiments is frequently seen as every data needed for reconstruction besides the event data itself. This includes all sorts of slowly evolving data like detector alignment, calibration and…

Databases · Computer Science 2007-05-23 A. Amorim, J. Lima, C. Oliveira, L. Pedro, N. Barros

HEP Software Foundation Community White Paper Working Group -- Conditions Data

To produce the best physics results, high energy physics experiments require access to calibration and other non-event data during event data processing. These conditions data are typically stored in databases that provide versioning…

Computational Physics · Physics 2019-01-17 Paul Laycock, Marko Bracko, Marco Clemencic, Dave Dykstra, Andrea Formica, Giacomo Govi, Michel Jouvin, David Lange, Lynn Wood

The Storage And Analytics Potential Of HBase Over The Cloud: A Survey

Apache HBase, a mainstay of the emerging Hadoop ecosystem, is a NoSQL key-value and column family hybrid database which, unlike a traditional RDBMS, is intentionally designed to scalably host large, semistructured, and heterogeneous data.…

Databases · Computer Science 2017-02-23 Georgios Drakopoulos, Andreas Kanavos, Christos Makris, Vasileios Megalooikonomou

Benchmarking DataStax Enterprise/Cassandra with HiBench

This report evaluates the new analytical capabilities of DataStax Enterprise (DSE) [1] through the use of standard Hadoop workloads. In particular, we run experiments with CPU and I/O bound micro-benchmarks as well as OLAP-style analytical…

Distributed, Parallel, and Cluster Computing · Computer Science 2014-12-17 Todor Ivanov, Raik Niemann, Sead Izberovic, Marten Rosselli, Karsten Tolle, Roberto V. Zicari

Methods for Linking Data to Online Resources and Ontologies with Applications to Neurophysiology

Across many domains, large swaths of digital assets are being stored across distributed data repositories, e.g., the DANDI Archive [8]. The distribution and diversity of these repositories impede researchers from formally defining…

Databases · Computer Science 2024-06-04 Matthew Avaylon, Ryan Ly, Andrew Tritt, Benjamin Dichter, Kristofer E. Bouchard, Christopher J. Mungall, Oliver Ruebel

Scalable Database Access Technologies for ATLAS Distributed Computing

ATLAS event data processing requires access to non-event data (detector conditions, calibrations, etc.) stored in relational databases. The database-resident data are crucial for the event data reconstruction processing steps and often…

Instrumentation and Detectors · Physics 2019-08-13 A. Vaniachine

The Event as an Object-Relational Database: Avoiding the Dependency Nightmare

With the use of object-oriented languages for HEP, many experiments have designed their data objects to contain direct references to other objects in the event (e.g., tracks and electromagnetic showers have references to each other to…

High Energy Physics - Experiment · Physics 2007-05-23 C. D. Jones

HRDBMS: Combining the Best of Modern and Traditional Relational Databases

HRDBMS is a novel distributed relational database that uses a hybrid model combining the best of traditional distributed relational databases and Big Data analytics platforms such as Hive. This allows HRDBMS to leverage years worth of…

Databases · Computer Science 2019-01-28 Jason Arnold, Boris Glavic, Ioan Raicu

LHC Databases on the Grid: Achievements and Open Issues

To extract physics results from the recorded data, the LHC experiments are using Grid computing infrastructure. The event data processing on the Grid requires scalable access to non-event data (detector conditions, calibrations, etc.)…

Databases · Computer Science 2010-07-13 A. V. Vaniachine

Benchmarking the Open Science Data Federation services to develop XRootD best practices

Research has become dependent on processing power and storage, one crucial aspect being data sharing. The Open Science Data Federation (OSDF) project aims to create a scientific global data distribution network based on the Pelican…

Information Retrieval · Computer Science 2026-05-14 Fabio Andrijauskas, Igor Sfiligoi, Frank Würthwein

Evaluating NoSQL Databases for OLAP Workloads: A Benchmarking Study of MongoDB, Redis, Kudu and ArangoDB

In the era of big data, conventional RDBMS models have become impractical for handling colossal workloads. Consequently, NoSQL databases have emerged as the preferred storage solutions for executing processing-intensive Online Analytical…

Databases · Computer Science 2024-05-29 Rishi Kesav Mohan, Risheek Rakshit Sukumar Kanmani, Krishna Anandan Ganesan, Nisha Ramasubramanian

hStorage-DB: Heterogeneity-aware Data Management to Exploit the Full Capability of Hybrid Storage Systems

As storage systems become increasingly heterogeneous and complex, it adds burdens on DBAs, causing suboptimal performance even after a lot of human efforts have been made. In addition, existing monitoring-based storage management by access…

Databases · Computer Science 2012-07-03 Tian Luo, Rubao Lee, Michael Mesnier, Feng Chen, Xiaodong Zhang

An Experimental Evaluation of Performance of A Hadoop Cluster on Replica Management

Hadoop is an open source implementation of the MapReduce Framework in the realm of distributed processing. A Hadoop cluster is a unique type of computational cluster designed for storing and analyzing large data sets across cluster of…

Distributed, Parallel, and Cluster Computing · Computer Science 2014-11-10 Muralikrishnan Ramane, Sharmila Krishnamoorthy, Sasikala Gowtham

Hierarchical Event Descriptor library schema for EEG data annotation

Standardizing terminology to annotate electrophysiological events can improve both computational research and clinical care. Sharing data enriched with standard terms can facilitate data exploration, from case studies to mega-analyses. The…

Neurons and Cognition · Quantitative Biology 2024-10-29 Dora Hermes, Tal Pal Attia, Sándor Beniczky, Jorge Bosch-Bayard, Arnaud Delorme, Brian Nils Lundstrom, Christine Rogers, Stefan Rampp, Seyed Yahya Shirazi, Dung Truong, Pedro Valdes-Sosa, Greg Worrell, Scott Makeig, Kay Robbins

HISTEX (HISTory EXerciser) : A tool for testing the implementation of Isolation Levels of Relational Database Management Systems

We present a multi-process application called HISTEX (HISTory EXerciser), which executes input histories in a generic transactional notation on commercial DBMS platforms. HISTEX could be used to discover potential errors in the…

Databases · Computer Science 2019-03-05 Dimitrios Liarokapis, Elizabeth ONeil, Patrick ONeil

Persistent storage of non-event data in the CMS databases

In the CMS experiment, the non event data needed to set up the detector, or being produced by it, and needed to calibrate the physical responses of the detector itself are stored in ORACLE databases. The large amount of data to be stored,…

Instrumentation and Detectors · Physics 2015-05-14 M. De Gruttola, S. Di Guida, D. Futyan, F. Glege, G. Govi, V. Innocente, P. Paolucci, P. Picca, A. Pierro, D. Schlatter, Z. Xie

A Study on Messaging Trade-offs in Data Streaming for Scientific Workflows

Memory-to-memory data streaming is essential for modern scientific workflows that require near real-time data analysis, experimental steering, and informed decision-making during experiment execution. It eliminates the latency bottlenecks…

Distributed, Parallel, and Cluster Computing · Computer Science 2025-09-10 Anjus George, Michael J. Brim, Christopher Zimmer, Tyler J. Skluzacek, A. J. Ruckman, Gustav R. Jansen, Sarp Oral

Warehousing Web Data

In a data warehousing process, mastering the data preparation phase allows substantial gains in terms of time and performance when performing multidimensional analysis or using data mining algorithms. Furthermore, a data warehouse can…

Databases · Computer Science 2007-05-23 Jérôme Darmont, Omar Boussaïd, Fadila Bentayeb

The Forgotten Document-Oriented Database Management Systems: An Overview and Benchmark of Native XML DODBMSes in Comparison with JSON DODBMSes

In the current context of Big Data, a multitude of new NoSQL solutions for storing, managing, and extracting information and patterns from semi-structured data have been proposed and implemented. These solutions were developed to relieve…

Databases · Computer Science 2021-02-05 Ciprian-Octavian Truică, Elena-Simona Apostol, Jérôme Darmont, Torben Bach Pedersen

Distributed Heterogeneous Relational Data Warehouse In A Grid Environment

This paper examines how a "Distributed Heterogeneous Relational Data Warehouse" can be integrated in a Grid environment that will provide physicists with efficient access to large and small object collections drawn from databases at…

Distributed, Parallel, and Cluster Computing · Computer Science 2016-08-31 Saima Iqbal, Julian J. Bunn, Harvey B. Newman