Provably Delay Efficient Data Retrieving in Storage Clouds
Abstract
One key requirement for storage clouds is to be able to retrieve data quickly. Recent system measurements have shown that the data retrieving delay in storage clouds is highly variable, which may result in a long latency tail. One crucial idea to improve the delay performance is to retrieve multiple data copies by using parallel downloading threads. However, how to optimally schedule these downloading threads to minimize the data retrieving delay remains to be an important open problem. In this paper, we develop low-complexity thread scheduling policies for several important classes of data downloading time distributions, and prove that these policies are either delay-optimal or within a constant gap from the optimum delay performance. These theoretical results hold for an arbitrary arrival process of read requests that may contain finite or infinite read requests, and for heterogeneous MDS storage codes that can support diverse storage redundancy and reliability requirements for different data files. Our numerical results show that the delay performance of the proposed policies is significantly better than that of First-Come- First-Served (FCFS) policies considered in prior work.
- Publication:
-
arXiv e-prints
- Pub Date:
- January 2015
- DOI:
- 10.48550/arXiv.1501.01661
- arXiv:
- arXiv:1501.01661
- Bibcode:
- 2015arXiv150101661S
- Keywords:
-
- Computer Science - Distributed;
- Parallel;
- and Cluster Computing;
- Computer Science - Information Theory
- E-Print:
- 17 pages, 4 figures. This is the technical report for a conference paper accepted by IEEE INFOCOM 2015