A Meta-graph Approach to Analyze Subgraph-centric Distributed Programming Models
Abstract
Component-centric distributed graph processing platforms that use a bulk synchronous parallel (BSP) programming model have gained traction. These address the shortcomings of Big Data abstractions and platforms like MapReduce/Hadoop for large-scale graph processing. However, there is limited literature on foundational aspects of the behavior of these component-centric abstractions for different graphs, graph partitionings, and graph algorithms. Here, we propose an analytical approach based on a meta-graph sketch to examine the characteristics of component-centric graph programming models at a coarse granularity. In particular, we apply this sketch to subgraph- and block-centric abstractions, and draw a comparison with vertex-centric models like Google's Pregel. First, we explore the impact of various graph partitioning techniques on the meta-graph, and next consider the impact of the meta-graph on graph algorithms. This decouples the unwieldy large graphs and their partitioning-specific artifacts from the algorithmic analysis. We use five spatial and powerlaw graphs as exemplars, four different partitioning strategies, and PageRank and Breadth First Search as canonical algorithms. These analyses over the meta-graphs provide a reliable measure of the expected number of supersteps, the communication and computational complexity of the algorithms for various graphs, and the relative merits of subgraph-centric models over vertex-centric ones.
- Publication:
- arXiv e-prints
- Pub Date:
- August 2015
- DOI:
- 10.48550/arXiv.1508.04265
- arXiv:
- arXiv:1508.04265
- Bibcode:
- 2015arXiv150804265D
- Keywords:
- Computer Science - Distributed, Parallel, and Cluster Computing
- E-Print:
- Proceedings of the IEEE International Conference on Big Data (Big Data), Washington DC, 2016