Optimizing CMS build infrastructure via Apache Mesos
Abstract
The Offline Software of the CMS Experiment at the Large Hadron Collider (LHC) at CERN consists of 6M lines of in-house code, developed over a decade by nearly 1000 physicists, as well as a comparable amount of general-use open-source code. A critical ingredient to the success of the construction and early operation of the Worldwide LHC Computing Grid (WLCG) was the convergence, around the year 2000, on the use of a homogeneous environment of commodity x86-64 processors and Linux.
Apache Mesos is a cluster manager that provides efficient resource isolation and sharing across distributed applications, or frameworks. It can run Hadoop, Jenkins, Spark, Aurora, and other applications on a dynamically shared pool of nodes. We present how we migrated our continuous integration system to schedule jobs on a relatively small Apache Mesos-enabled cluster and how this resulted in better resource usage, higher peak performance, and lower latency thanks to the dynamic scheduling capabilities of Mesos.
- Publication:
- Journal of Physics Conference Series
- Pub Date:
- December 2015
- DOI:
- 10.1088/1742-6596/664/6/062013
- arXiv:
- arXiv:1507.07429
- Bibcode:
- 2015JPhCS.664f2013A
- Keywords:
- Computer Science - Distributed, Parallel, and Cluster Computing;
- High Energy Physics - Experiment
- E-Print:
- Submitted to proceedings of the 21st International Conference on Computing in High Energy and Nuclear Physics (CHEP2015), Okinawa, Japan