A Taxonomy of Data Grids for Distributed Data Sharing, Management and Processing
Abstract
Data Grids have been adopted as the platform for scientific communities that need to share, access, transport, process and manage large data collections distributed worldwide. They combine high-end computing technologies with high-performance networking and wide-area storage management techniques. In this paper, we discuss the key concepts behind Data Grids and compare them with other data sharing and distribution paradigms such as content delivery networks, peer-to-peer networks and distributed databases. We then provide comprehensive taxonomies that cover various aspects of architecture, data transportation, data replication and resource allocation and scheduling. Finally, we map the proposed taxonomy to various Data Grid systems, not only to validate the taxonomy but also to identify areas for future exploration. Through this taxonomy, we aim to categorise existing systems to better understand their goals and their methodology. This would help evaluate their applicability for solving similar problems. The taxonomy also provides a "gap analysis" of this area through which researchers can potentially identify new issues for investigation. We also hope that the proposed taxonomy and mapping help to provide an easy way for new practitioners to understand this complex area of research.
- Publication: arXiv e-prints
- Pub Date: June 2005
- DOI: 10.48550/arXiv.cs/0506034
- arXiv: arXiv:cs/0506034
- Bibcode: 2005cs........6034V
- Keywords: Computer Science - Distributed, Parallel, and Cluster Computing; Computer Science - Computational Engineering, Finance, and Science; A.1; C.2.4; J.2
- E-Print: 46 pages, 16 figures, Technical Report