Streaming Graph Algorithms in the Massively Parallel Computation Model
Abstract
We initiate the study of graph algorithms in the streaming setting on massive distributed and parallel systems inspired by practical data processing systems. The objective is to design algorithms that can efficiently process evolving graphs via large batches of edge insertions and deletions using as little memory as possible. We focus on the nowadays canonical model for the study of theoretical algorithms for massive networks, the Massively Parallel Computation (MPC) model. We design MPC algorithms that efficiently process evolving graphs: in a constant number of rounds they can handle large batches of edge updates for problems such as connectivity, minimum spanning forest, and approximate matching while adhering to the most restrictive memory regime, in which the local memory per machine is strongly sublinear in the number of vertices and the total memory is sublinear in the graph size. These results improve upon earlier works in this area which rely on using larger total space, proportional to the size of the processed graph. Our work demonstrates that parallel algorithms can process dynamically changing graphs with asymptotically optimal utilization of MPC resources: parallel time, local memory, and total memory, while processing large batches of edge updates.
- Publication:
-
arXiv e-prints
- Pub Date:
- January 2025
- arXiv:
- arXiv:2501.10230
- Bibcode:
- 2025arXiv250110230C
- Keywords:
-
- Computer Science - Data Structures and Algorithms
- E-Print:
- 36 Pages. A preliminary version has been appeared in PODC 2024