Scalable Task-Based Algorithm for Multiplication of Block-Rank-Sparse Matrices
Abstract
A task-based formulation of the Scalable Universal Matrix Multiplication Algorithm (SUMMA), a popular algorithm for matrix multiplication (MM), is applied to the multiplication of hierarchy-free, rank-structured matrices that appear in the domain of quantum chemistry (QC). The novel features of our formulation are: (1) concurrent scheduling of multiple SUMMA iterations, and (2) fine-grained task-based composition. These features make it tolerant of the load imbalance due to the irregular matrix structure and eliminate all artifactual sources of global synchronization. Scalability of iterative computation of square-root inverse of block-rank-sparse QC matrices is demonstrated; for full-rank (dense) matrices the performance of our SUMMA formulation usually exceeds that of the state-of-the-art dense MM implementations (ScaLAPACK and Cyclops Tensor Framework).
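To make the task decomposition concrete, the following is a minimal serial sketch, not the authors' implementation. It assumes a matrix stored as a grid of small blocks in which `None` marks a zero block, and it treats each nonzero block as dense rather than low-rank. SUMMA iteration `k` combines block-column `k` of A with block-row `k` of B, and every nonzero block product becomes one independent task `(i, j, k)`. The names `summa_tasks`, `block_summa`, `matmul`, and `add_into` are hypothetical, and the distributed scheduling and communication described in the abstract are omitted.

```python
# Serial sketch of block-sparse SUMMA; None marks a zero block.
# All names are illustrative and are not taken from the paper.

def matmul(a, b):
    """Dense product of two small matrices stored as lists of rows."""
    return [[sum(a[i][k] * b[k][j] for k in range(len(b)))
             for j in range(len(b[0]))] for i in range(len(a))]

def add_into(c, p):
    """Accumulate matrix p into matrix c in place."""
    for i, row in enumerate(p):
        for j, value in enumerate(row):
            c[i][j] += value

def summa_tasks(A, B):
    """Yield one (i, j, k) task per nonzero block product of iteration k."""
    for k in range(len(B)):                  # SUMMA iteration k
        for i in range(len(A)):              # walk block-column k of A
            if A[i][k] is None:
                continue                     # zero block: no work generated
            for j in range(len(B[0])):       # walk block-row k of B
                if B[k][j] is None:
                    continue
                yield (i, j, k)

def block_summa(A, B):
    """Execute the tasks in order, accumulating into result blocks C[i][j]."""
    C = [[None] * len(B[0]) for _ in range(len(A))]
    for i, j, k in summa_tasks(A, B):
        product = matmul(A[i][k], B[k][j])
        if C[i][j] is None:
            C[i][j] = product
        else:
            add_into(C[i][j], product)
    return C
```

Tasks with different `(i, j)` write to different result blocks and have no data dependence on one another, so in principle a scheduler could run tasks from several iterations `k` at once, which is the kind of freedom the abstract attributes to concurrent scheduling of multiple SUMMA iterations. Because the zero pattern decides how many tasks each result block receives, the work per block is uneven, which illustrates the load imbalance mentioned above.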
- Publication: arXiv e-prints
- Pub Date: September 2015
- DOI: 10.48550/arXiv.1509.00309
- arXiv: arXiv:1509.00309
- Bibcode: 2015arXiv150900309C
- Keywords: Computer Science - Distributed, Parallel, and Cluster Computing
- E-Print: 8 pages, 6 figures, accepted to IA3 2015. arXiv admin note: text overlap with arXiv:1504.05046