Communication-optimal parallel and sequential QR and LU factorizations: theory and practice
Abstract
We present parallel and sequential dense QR factorization algorithms that are both optimal (up to polylogarithmic factors) in the amount of communication they perform, and just as stable as Householder QR. Our first algorithm, Tall Skinny QR (TSQR), factors m-by-n matrices in a one-dimensional (1-D) block cyclic row layout, and is optimized for m >> n. Our second algorithm, CAQR (Communication-Avoiding QR), factors general rectangular matrices distributed in a two-dimensional block cyclic layout. It invokes TSQR for each block column factorization.
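The reduction idea behind TSQR can be illustrated with a small serial sketch. It is not the authors' implementation: the function name `tsqr_r`, the `num_blocks` parameter, the split into contiguous row blocks (instead of a block cyclic layout), and the binary reduction tree are all assumptions made for illustration. Each row block is factored on its own, and the small R factors are then stacked pairwise and refactored until one R remains.

```python
import numpy as np


def tsqr_r(A, num_blocks=4):
    """Return an R factor of a tall-skinny matrix A (hypothetical helper).

    Assumption: rows are split into contiguous blocks and the local R
    factors are combined along a binary tree. Only R is computed; Q is
    left implicit.
    """
    # Leaf step: an independent local QR on each row block.
    blocks = np.array_split(A, num_blocks, axis=0)
    r_factors = [np.linalg.qr(block, mode="r") for block in blocks]

    # Reduction step: stack neighbouring R factors and refactor them,
    # halving the number of factors at each tree level.
    while len(r_factors) > 1:
        next_level = []
        for i in range(0, len(r_factors), 2):
            if i + 1 < len(r_factors):
                stacked = np.vstack([r_factors[i], r_factors[i + 1]])
                next_level.append(np.linalg.qr(stacked, mode="r"))
            else:
                # Odd factor out: carry it up to the next level unchanged.
                next_level.append(r_factors[i])
        r_factors = next_level

    return r_factors[0]


if __name__ == "__main__":
    rng = np.random.default_rng(0)
    A = rng.standard_normal((1000, 5))  # m >> n
    R = tsqr_r(A, num_blocks=8)
    # R^T R should reproduce A^T A up to rounding error.
    print(np.allclose(R.T @ R, A.T @ A))
```

In a parallel setting each leaf factorization would run on a different processor, and only the small n-by-n R factors would travel between processors, which is where the communication savings come from. The rows of R produced this way may differ in sign from those returned by a single call to `np.linalg.qr`.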
- Publication: arXiv e-prints
- Pub Date: June 2008
- DOI: 10.48550/arXiv.0806.2159
- arXiv: arXiv:0806.2159
- Bibcode: 2008arXiv0806.2159D
- Keywords: Computer Science - Numerical Analysis