OpenCluster: A Flexible Distributed Computing Framework for Astronomical Data Processing
Abstract
The volume of data generated by modern astronomical telescopes is extremely large and growing rapidly. However, current high-performance data processing architectures and frameworks are not well suited to astronomers because of their limitations and programming difficulty. In this paper, we therefore present OpenCluster, an open-source distributed computing framework that supports the rapid development of high-performance processing pipelines for astronomical big data. We first detail the design principles and implementation of OpenCluster and present the APIs provided by the framework. We then demonstrate a case in which OpenCluster is used to resolve complex data processing problems in developing a pipeline for the Mingantu Ultrawide Spectral Radioheliograph. Finally, we present our performance evaluation of OpenCluster. Overall, OpenCluster provides not only high fault tolerance and simple programming interfaces, but also a flexible means of scaling up the number of interacting entities. OpenCluster thereby provides an easily integrated distributed computing framework for quickly developing high-performance data processing systems for astronomical telescopes and for significantly reducing software development expenses.
- Publication:
- Publications of the Astronomical Society of the Pacific
- Pub Date:
- February 2017
- DOI:
- 10.1088/1538-3873/129/972/024001
- arXiv:
- arXiv:1701.04907
- Bibcode:
- 2017PASP..129b4001W
- Keywords:
- Astrophysics - Instrumentation and Methods for Astrophysics;
- Computer Science - Distributed, Parallel, and Cluster Computing
- E-Print:
- doi:10.1088/1538-3873/129/972/024001