Related papers: Optimizing ROOT IO For Analysis
We overview recent changes in the ROOT I/O system, increasing performance and enhancing it and improving its interaction with other data analysis ecosystems. Both the newly introduced compression algorithms, the much faster bulk I/O data…
ROOT provides an flexible format used throughout the HEP community. The number of use cases - from an archival data format to end-stage analysis - has required a number of tradeoffs to be exposed to the user. For example, a high…
The LHCs Run3 will push the envelope on data-intensive workflows and, since at the lowest level this data is managed using the ROOT software framework, preparations for managing this data are starting already. At the beginning of LHC Run 1,…
Distinct HEP workflows have distinct I/O needs; while ROOT I/O excels at serializing complex C++ objects common to reconstruction, analysis workflows typically have simpler objects and can sustain higher event rates. To meet these…
When processing large amounts of data, the rate at which reading and writing can take place is a critical factor. High energy physics data processing relying on ROOT is no exception. The recent parallelisation of LHC experiments' software…
ROOT is a data analysis framework broadly used in and outside of High Energy Physics (HEP). Since HEP software frameworks always strive for performance improvements, ROOT was extended with experimental support of runtime C++ Modules. C++…
ROOT is an object-oriented C++ framework conceived in the high-energy physics (HEP) community, designed for storing and analyzing petabytes of data in an efficient way. Any instance of a C++ class can be stored into a ROOT file in a…
This document discusses the state, roadmap, and risks of the foundational components of ROOT with respect to the experiments at the HL-LHC (Run 4 and beyond). As foundational components, the document considers in particular the ROOT…
The ROOT TTree data format encodes hundreds of petabytes of High Energy and Nuclear Physics events. Its columnar layout drives rapid analyses, as only those parts ("branches") that are really used in a given analysis need to be read from…
RooFit and RooStats, the toolkits for statistical modelling in ROOT, are used in most searches and measurements at the Large Hadron Collider as well as at $B$ factories. Larger datasets to be collected at e.g. the High-Luminosity LHC will…
Despite recent efforts to collect multi-task, multi-embodiment datasets, to design recipes for training Vision-Language-Action models (VLAs), and to showcase these models on different robot platforms, generalist cross-embodiment robot…
Upcoming HEP experiments, e.g. at the HL-LHC, are expected to increase the volume of generated data by at least one order of magnitude. In order to retain the ability to analyze the influx of data, full exploitation of modern storage…
In this article, we present the High-Performance Output (HiPO) data format developed at Jefferson Laboratory for storing and analyzing data from Nuclear Physics experiments. The format was designed to efficiently store large amounts of…
For the last several months the main focus of development in the ROOT I/O package has been code consolidation and performance improvements. Access to remote files is affected both by bandwidth and latency. We introduced a pre-fetch…
Modern NVMe SSDs and RDMA networks provide dramatically higher bandwidth and concurrency. Existing networked storage systems (e.g., NVMe over Fabrics) fail to fully exploit these new devices due to inefficient storage ordering guarantees.…
Compared to LHC Run 1 and Run 2, future HEP experiments, e.g., at the HL-LHC, will increase the volume of generated data by an order of magnitude. In order to sustain the expected analysis throughput, ROOT's RNTuple I/O subsystem has been…
High Energy Physics (HEP) experiments, for example at the Large Hadron Collider (LHC) at CERN, store data at exabyte scale in sets of files. They use a binary columnar data format by the ROOT framework, that also transparently compresses…
The ROOT based Offline and Online Analysis (ROAn) framework was developed to perform data analysis on data from Depleted P-channel Field Effect Transistor (DePFET) detectors, a type of active pixel sensors developed at the MPI…
We present DIO, a generic tool for observing inefficient and erroneous I/O interactions between applications and in-kernel storage systems that lead to performance, dependability, and correctness issues. DIO facilitates the analysis and…
HEP-Frame is a new C++ package designed to efficiently perform analyses of data sets from a very large number of events, like those available at the Large Hadron Collider (LHC) at CERN, Geneva. It mainly targets high performance servers and…