The Offline Software of the CMS Experiment at the Large Hadron Collider (LHC) at CERN consists of 6M lines of in-house code, developed over a decade by nearly 1000 physicists, as well as a comparable amount of general use open-source code. A critical ingredient to the success of the construction and early operation of the WLCG was the convergence, around the year 2000, on the use of a homogeneous environment of commodity x86-64 processors and Linux. Apache Mesos is a cluster manager that provides efficient resource isolation and sharing across distributed applications, or frameworks. It can run Hadoop, Jenkins, Spark, Aurora, and other applications on a dynamically shared pool of nodes. We present how we migrated our continuos integration system to schedule jobs on a relatively small Apache Mesos enabled cluster and how this resulted in better resource usage, higher peak performance and lower latency thanks to the dynamic scheduling capabilities of Mesos.
@article{arxiv.1507.07429,
title = {Optimizing CMS build infrastructure via Apache Mesos},
author = {David Abdurachmanov and Alessandro Degano and Peter Elmer and Giulio Eulisse and David Mendez and Shahzad Muzaffar},
journal= {arXiv preprint arXiv:1507.07429},
year = {2016}
}
Comments
Submitted to proceedings of the 21st International Conference on Computing in High Energy and Nuclear Physics (CHEP2015), Okinawa, Japan