Streamlining Pipeline Workflows: Using Python with an Object-Oriented Approach to Consolidate Aggregate Pipeline Processes
Abstract
The Keck Observatory Archive (KOA), a collaboration between the NASA Exoplanet Science Institute and the W. M. Keck Observatory, serves science and calibration data for all current and retired instruments from the twin Keck Telescopes. In addition to the raw data, we publicly serve quick-look, reduced data products for five instruments (HIRES, LWS, NIRC2, NIRSPEC, and OSIRIS), so that KOA users can easily assess the quality and scientific content of the data. In this paper we present the modernization of the Data Evaluation and Processing (DEP) pipeline, our quality assurance tool for ensuring that science data are ready for archiving. Because there was no common infrastructure for data headers, the DEP pipeline had to accommodate each new or upgraded instrument through additional control paths. Over time, new processing modules were added in a variety of languages, including IDL, C, CSH, PHP, and Python, and invoking multiple interpreters within a single run added considerable overhead. This project consolidates the DEP pipeline into a single language, Python, using an object-oriented approach. The object-oriented approach allows us to abstract out the differences between instruments and use common variables in place of instrument-specific values. As a result, a new instrument needs only a subclass that overrides the differing values in order to work with the pipeline. By consolidating everything in Python, we have improved efficiency and made the pipeline easier to operate and maintain.
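The subclassing pattern described in the abstract can be illustrated with a minimal sketch. This is not the DEP pipeline's actual code: the class names (`Instrument`, `ExampleInstrumentA`, `ExampleInstrumentB`), the method names, and the header keywords used here are all hypothetical, chosen only to show how a base class can hold the shared logic while each instrument subclass overrides the values that differ.

```python
class Instrument:
    """Hypothetical base class holding logic shared by all instruments."""

    # Header keyword names (assumed for illustration); a subclass
    # overrides only the ones that differ for its instrument.
    date_keyword = "DATE-OBS"
    exptime_keyword = "EXPTIME"

    def __init__(self, header):
        # Copy the header so the caller's mapping is never modified.
        self.header = dict(header)

    def common_metadata(self):
        """Map instrument-specific keywords onto common variable names."""
        return {
            "instrument": type(self).__name__,
            "date": self.header.get(self.date_keyword),
            "exptime": self.header.get(self.exptime_keyword),
        }

    def is_ready_for_archive(self):
        """Toy quality check: every common value must be present."""
        return all(v is not None for v in self.common_metadata().values())


class ExampleInstrumentA(Instrument):
    """Uses the default keywords, so nothing needs overriding."""


class ExampleInstrumentB(Instrument):
    """Stores exposure time under a different (assumed) keyword."""

    exptime_keyword = "ITIME"


if __name__ == "__main__":
    frames = [
        ExampleInstrumentA({"DATE-OBS": "2017-10-22", "EXPTIME": 30.0}),
        ExampleInstrumentB({"DATE-OBS": "2017-10-22", "ITIME": 30.0}),
    ]
    # The calling code is identical for every instrument.
    for frame in frames:
        print(frame.common_metadata(), frame.is_ready_for_archive())
```

In this sketch, supporting another instrument means adding one small subclass; the code that calls `common_metadata()` and `is_ready_for_archive()` does not change.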
Publication: Astronomical Data Analysis Software and Systems XXVII
Pub Date: October 2019
Bibcode: 2019ASPC..523..163B