Parallel Algorithms for Adding a Collection of Sparse Matrices
Abstract
We develop a family of parallel algorithms for the SpKAdd operation that adds a collection of k sparse matrices. SpKAdd is a much-needed operation in many applications, including distributed-memory sparse matrix-matrix multiplication (SpGEMM), streaming accumulations of graphs, and algorithmic sparsification of the gradient updates in deep learning. While adding two sparse matrices is a common operation in Matlab, Python, Intel MKL, and various GraphBLAS libraries, these implementations do not perform well when adding a large collection of sparse matrices. We develop a series of algorithms using tree merging, heap, sparse accumulator, hash table, and sliding hash table data structures. Among them, hash-based algorithms attain the theoretical lower bounds on both the computational and I/O complexities and perform the best in practice. The newly developed hash SpKAdd makes the computation of a distributed-memory SpGEMM algorithm at least 2x faster than the previous state-of-the-art algorithms.
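To make the hash-based idea concrete, the following is a minimal serial sketch in Python, not the authors' implementation. It assumes a hypothetical column-wise format in which each matrix is a dict mapping a column index to a list of (row, value) pairs; the name `spkadd_hash` and this format are illustrative assumptions. For each output column, all k inputs are accumulated into one hash table in a single pass, rather than being added pairwise.

```python
from typing import Dict, List, Tuple

# Hypothetical format (an assumption for this sketch): a sparse matrix is a
# dict mapping column index -> list of (row index, value) pairs.
SparseMatrix = Dict[int, List[Tuple[int, float]]]


def spkadd_hash(matrices: List[SparseMatrix]) -> SparseMatrix:
    """Add k sparse matrices column by column using a hash table per column."""
    # Collect every column that holds at least one entry in some input.
    columns = set()
    for matrix in matrices:
        columns.update(matrix.keys())

    result: SparseMatrix = {}
    for col in sorted(columns):
        # The dict plays the role of the hash table: row index -> running sum.
        accumulator: Dict[int, float] = {}
        for matrix in matrices:
            for row, value in matrix.get(col, ()):
                accumulator[row] = accumulator.get(row, 0.0) + value
        # Emit entries sorted by row index. Entries that cancel to 0.0 are
        # kept as explicit zeros in this sketch.
        result[col] = sorted(accumulator.items())
    return result


if __name__ == "__main__":
    a = {0: [(0, 1.0), (2, 2.0)], 1: [(1, 3.0)]}
    b = {0: [(2, 5.0)], 2: [(0, 4.0)]}
    c = {1: [(1, -1.0), (3, 7.0)]}
    print(spkadd_hash([a, b, c]))
```

In this sketch each input entry is touched once, and each output column is computed independently of the others, so the column loop is the natural place to introduce parallelism.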
- Publication: arXiv e-prints
- Pub Date: December 2021
- arXiv: arXiv:2112.10223
- Bibcode: 2021arXiv211210223T
- Keywords: Computer Science - Distributed, Parallel, and Cluster Computing