rta-dq-lib: Software Library to Perform Online Data Quality Analysis of Scientific Data
Abstract
The Cherenkov Telescope Array (CTA) is an initiative currently building what will be the largest ground-based gamma-ray observatory ever constructed. A Science Alert Generation (SAG) system, part of the Array Control and Data Acquisition (ACADA) system of the CTA Observatory, analyses telescope data online, as they arrive at an event rate of tens of kHz, to detect transient gamma-ray events. The SAG system also performs online data quality analysis to assess instrument health during acquisition; this analysis is crucial for confirming good detections. A software library for online data quality analysis of CTA data, called rta-dq-lib and provided in a Python and a C++ version, has been proposed for CTA. The Python version is dedicated to rapid prototyping of data quality use cases. The C++ version is optimized for maximum performance. The library allows users to define, through XML configuration files, the format of the input data and, for each data field, which quality checks must be performed and which types of aggregations and transformations must be applied. It internally translates the XML configuration into a directed acyclic computational graph that encodes the dependencies among the computational tasks to be performed. This model lets the library readily exploit thread-level parallelism, and its overall flexibility allows us to develop generic data quality analysis pipelines that could be reused in other applications.
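The configuration-to-graph idea described in the abstract can be illustrated with a minimal Python sketch. Everything specific here is an assumption made for illustration and is not the rta-dq-lib API or its real XML schema: the element and attribute names (`dq`, `field`, `task`, `op`, `depends`), the three operations, and the helper functions `build_graph` and `run_field` are all hypothetical. The sketch parses a small XML configuration, builds a dependency graph of tasks for one data field, and runs each batch of independent tasks on a thread pool using only the Python standard library (Python 3.9 or later for `graphlib`).

```python
import xml.etree.ElementTree as ET
from concurrent.futures import ThreadPoolExecutor
from graphlib import TopologicalSorter

# Hypothetical configuration; NOT the real rta-dq-lib schema.
CONFIG = """
<dq>
  <field name="charge">
    <task id="in_range" op="range_check" min="0" max="100"/>
    <task id="mean" op="mean" depends="in_range"/>
    <task id="count" op="count" depends="in_range"/>
  </field>
</dq>
"""

def range_check(values, attrs):
    """Quality check: keep only samples inside [min, max]."""
    lo, hi = float(attrs["min"]), float(attrs["max"])
    return [v for v in values if lo <= v <= hi]

def mean(values, attrs):
    """Aggregation: arithmetic mean, or None for an empty input."""
    return sum(values) / len(values) if values else None

def count(values, attrs):
    """Aggregation: number of samples."""
    return len(values)

# Illustrative operation registry (assumed names).
OPS = {"range_check": range_check, "mean": mean, "count": count}

def build_graph(xml_text):
    """Parse the XML into {field: {task_id: (op, attrs, deps)}}."""
    root = ET.fromstring(xml_text)
    graph = {}
    for field in root.iter("field"):
        tasks = {}
        for task in field.iter("task"):
            deps = task.get("depends", "").split()
            tasks[task.get("id")] = (OPS[task.get("op")], dict(task.attrib), deps)
        graph[field.get("name")] = tasks
    return graph

def run_field(tasks, samples, max_workers=4):
    """Run one field's tasks in dependency order.

    Simplification: a task reads the output of its first dependency,
    or the raw samples if it has none.
    """
    sorter = TopologicalSorter({tid: deps for tid, (_, _, deps) in tasks.items()})
    sorter.prepare()  # raises graphlib.CycleError if the graph has a cycle
    results = {}
    with ThreadPoolExecutor(max_workers=max_workers) as pool:
        while sorter.is_active():
            # All tasks returned together are mutually independent.
            futures = {}
            for tid in sorter.get_ready():
                op, attrs, deps = tasks[tid]
                data = results[deps[0]] if deps else samples
                futures[tid] = pool.submit(op, data, attrs)
            # Wait for the whole batch before releasing dependent tasks.
            for tid, future in futures.items():
                results[tid] = future.result()
                sorter.done(tid)
    return results

graph = build_graph(CONFIG)
results = run_field(graph["charge"], [10, 50, 150, -5, 90])
print(results)
```

In this sketch the `mean` and `count` tasks both depend only on `in_range`, so they are submitted to the pool in the same batch. The example is meant to show the scheduling structure only: in CPython, pure-Python CPU-bound tasks gain little from threads, which is consistent with the abstract reserving maximum performance for the C++ version.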
- Publication: Astronomical Data Analysis Software and Systems XXX
- Pub Date: July 2022
- DOI: 10.48550/arXiv.2105.08648
- arXiv: arXiv:2105.08648
- Bibcode: 2022ASPC..532..365B
- Keywords: Astrophysics - Instrumentation and Methods for Astrophysics
- E-Print: ADASS 2020 conference