Checkpointing strategies with prediction windows
Abstract
This paper studies the impact of fault prediction techniques on checkpointing strategies. We assume that the fault-prediction system provides prediction windows rather than exact predictions, which dramatically complicates the analysis of the checkpointing strategies. We propose a new approach based on two periodic modes: a regular mode outside prediction windows, and a proactive mode inside prediction windows whenever these windows are large enough. We compute the best period for any size of the prediction windows, thereby deriving the scheduling strategy that minimizes platform waste. The results of this analytical evaluation are corroborated by a comprehensive set of simulations, which demonstrate the validity of the model and the accuracy of the approach.
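The two-mode idea in the abstract can be illustrated with a minimal sketch. It is not the paper's algorithm or its optimal periods: the function name `checkpoint_times`, the parameters `regular_period`, `proactive_period`, and `min_window`, and the rule that a window triggers a checkpoint exactly at its start are all assumptions made for illustration. The sketch only shows how a schedule might switch between a regular period outside prediction windows and a shorter proactive period inside windows that are long enough.

```python
def checkpoint_times(horizon, windows, regular_period, proactive_period, min_window):
    """Return checkpoint times on [0, horizon) for a hypothetical two-mode schedule.

    windows: list of (start, end) prediction windows, assumed non-overlapping.
    Windows shorter than min_window are ignored (the schedule stays in regular mode).
    All parameter names and rules here are illustrative assumptions.
    """
    times = []
    t = 0.0  # start of the current regular-mode stretch
    for start, end in sorted(windows):
        if end - start < min_window or start < t:
            continue  # window too short or overlapping: remain in regular mode
        # Regular mode: one checkpoint every regular_period until the window opens.
        k = 1
        while t + k * regular_period < start:
            times.append(t + k * regular_period)
            k += 1
        # Proactive mode: one checkpoint every proactive_period inside the window.
        k = 0
        while start + k * proactive_period < end:
            times.append(start + k * proactive_period)
            k += 1
        t = end  # regular mode restarts when the window closes
    # Regular mode after the last accepted window.
    k = 1
    while t + k * regular_period < horizon:
        times.append(t + k * regular_period)
        k += 1
    return times


if __name__ == "__main__":
    schedule = checkpoint_times(
        horizon=100,
        windows=[(30, 40), (60, 62)],  # the second window is below min_window
        regular_period=10,
        proactive_period=4,
        min_window=5,
    )
    print(schedule)
```

In this toy run the short window (60, 62) leaves the regular schedule untouched, while the window (30, 40) switches to the shorter period for its duration. Choosing the periods and the window-size threshold so that platform waste is minimized is the subject of the paper itself and is not reproduced here.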
- Publication: arXiv e-prints
- Pub Date: February 2013
- arXiv: arXiv:1302.4558
- Bibcode: 2013arXiv1302.4558A
- Keywords: Computer Science - Distributed, Parallel, and Cluster Computing
- E-Print: 35 pages, work supported by ANR Rescue. arXiv admin note: substantial text overlap with arXiv:1207.6936, arXiv:1302.3752