Advances in monitoring of grid services in WLCG
Abstract
During 2006, the Worldwide LHC Computing Grid Project (WLCG) constituted several working groups in the area of fabric and application monitoring, with the mandate of improving the reliability and availability of the grid infrastructure through improved monitoring of the grid fabric. This paper discusses the work of one of these groups: the 'Grid Service Monitoring Working Group'. The group aims to evaluate the existing monitoring system and create a coherent architecture that lets the existing system continue to run while increasing the quality and quantity of the monitoring information gathered. We describe the stakeholders in this project, focusing in particular on the needs of site administrators, which were not well satisfied by existing solutions. We present several standards for service metric gathering and grid monitoring data exchange, and show the place of each in the architecture. Finally, we describe the use of a Nagios-based prototype deployment to validate our ideas, and the progress on turning this prototype into a production-ready system.
- Publication: Journal of Physics Conference Series
- Pub Date: July 2008
- DOI: 10.1088/1742-6596/119/6/062021
- Bibcode: 2008JPhCS.119f2021C