A Self-adaptive Auto-scaling Method for Scientific Applications on HPC Environments and Clouds
Abstract
Computation-intensive applications can take days or even months to complete a single execution. Over such a period, the available resources commonly vary, because the hardware is usually shared among many researchers and departments within an organization. At the same time, High Performance Computing (HPC) clusters can use Cloud bursting techniques to run applications on Cloud resources together with on-premise resources. To meet deadlines, computation-intensive applications that are data and task parallel can use the Cloud to boost their performance. This article presents ongoing work towards extending the resources of an HPC execution platform with the Cloud. We propose a unified view of such heterogeneous environments and a method that monitors the application, predicts its execution time, and dynamically shifts part of the domain, previously running on local HPC hardware, to be computed on the Cloud so that a specific deadline is met. The method is exemplified with a seismic application that, at runtime, adapts itself to move part of the processing to the Cloud (a movement called bursting) and also auto-scales the moved part over Cloud nodes. Our preliminary results show that, as expected, performing this movement and synchronizing results incurs an overhead, but they also demonstrate that bursting is an important feature for meeting deadlines when an on-premise cluster is overloaded or cannot provide the capacity needed for a particular project.
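The monitor, predict, and shift loop described in the abstract can be illustrated with a minimal Python sketch. It is not the authors' implementation: the abstract does not state how execution time is predicted, so the sketch assumes simple linear extrapolation from the observed throughput, and every name in it (`Progress`, `predict_remaining_s`, `units_to_burst`, `cloud_nodes_needed`, the notion of a "work unit", and the fixed `overhead_s` for moving data and synchronizing results) is hypothetical.

```python
import math
from dataclasses import dataclass


@dataclass
class Progress:
    """Snapshot produced by a (hypothetical) monitor of the local HPC run."""
    elapsed_s: float   # wall-clock seconds since the run started
    done_units: int    # work units (e.g. slices of the domain) finished locally
    total_units: int   # work units in the whole domain


def predict_remaining_s(p: Progress) -> float:
    """Predict the remaining local run time by linear extrapolation (assumption)."""
    if p.done_units <= 0:
        raise ValueError("need at least one finished unit to estimate throughput")
    return (p.total_units - p.done_units) * p.elapsed_s / p.done_units


def units_to_burst(p: Progress, deadline_s: float, overhead_s: float = 0.0) -> int:
    """Number of remaining units to shift to the Cloud so the local part meets the deadline.

    deadline_s is measured from the start of the run; overhead_s is an assumed
    fixed cost for moving data and synchronizing results.
    """
    if p.done_units <= 0:
        raise ValueError("need at least one finished unit to estimate throughput")
    remaining = p.total_units - p.done_units
    time_left = deadline_s - p.elapsed_s - overhead_s
    if time_left <= 0:
        return remaining  # nothing more can finish locally in time
    # Units the local cluster can still finish at the observed throughput.
    local_capacity = math.floor(p.done_units * time_left / p.elapsed_s)
    return max(0, remaining - local_capacity)


def cloud_nodes_needed(burst_units: int, node_rate: float, time_left_s: float) -> int:
    """Auto-scaling step: Cloud nodes needed to finish the shifted units in time.

    node_rate is an assumed per-node throughput in units per second.
    """
    if burst_units == 0:
        return 0
    if node_rate <= 0 or time_left_s <= 0:
        raise ValueError("node_rate and time_left_s must be positive")
    return math.ceil(burst_units / (node_rate * time_left_s))


if __name__ == "__main__":
    snapshot = Progress(elapsed_s=100.0, done_units=10, total_units=100)
    burst = units_to_burst(snapshot, deadline_s=600.0, overhead_s=50.0)
    print("predicted remaining seconds:", predict_remaining_s(snapshot))
    print("units to burst:", burst)
    print("cloud nodes:", cloud_nodes_needed(burst, node_rate=0.0625, time_left_s=450.0))
```

In a real deployment the snapshot would be refreshed periodically, so the decision can be revised as the load on the shared cluster changes; the constant-throughput model here is only a placeholder for whatever predictor the application uses.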
- Publication: arXiv e-prints
- Pub Date: December 2014
- DOI: 10.48550/arXiv.1412.6392
- arXiv: arXiv:1412.6392
- Bibcode: 2014arXiv1412.6392M
- Keywords: Computer Science - Distributed, Parallel, and Cluster Computing
- E-Print: Part of ADAPT Workshop proceedings, 2015 (arXiv:1412.2347)