Simpler is better: A comparative study of randomized algorithms for computing the CUR decomposition
Abstract
The CUR decomposition is a technique for low-rank approximation that selects small subsets of the columns and rows of a given matrix to use as bases for its column and row spaces. It has recently attracted much interest, as it has several advantages over traditional low-rank decompositions based on orthonormal bases. These include the preservation of properties such as sparsity or non-negativity, the ability to interpret data, and reduced storage requirements. The problem of finding the skeleton sets that minimize the norm of the residual error is known to be NP-hard, but classical pivoting schemes such as column-pivoted QR tend to work well in practice. When combined with randomized dimension reduction techniques, classical pivoting-based methods become particularly effective, and have proven capable of very rapidly computing approximate CUR decompositions of large, potentially sparse, matrices. Another class of popular algorithms for computing CUR decompositions is based on drawing the columns and rows randomly from the full index sets, using specialized probability distributions based on leverage scores. Such sampling-based techniques are particularly appealing for very large-scale problems, and are well supported by theoretical performance guarantees. This manuscript provides a comparative study of the various randomized algorithms for computing CUR decompositions that have recently been proposed. Additionally, it proposes some modifications and simplifications to the existing algorithms that lead to faster execution times.
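To make the pivoting-plus-sketching idea in the abstract concrete, the following is a minimal sketch of one way a randomized CUR decomposition could be computed with NumPy and SciPy. It is not taken from the manuscript and should not be read as the authors' algorithm: the function name `randomized_cur`, the Gaussian sketch, the `oversample` parameter, and the choice of U = C⁺ A R⁺ as the linking matrix are all assumptions made for illustration.

```python
import numpy as np
from scipy.linalg import qr


def randomized_cur(A, k, oversample=10, seed=0):
    """Illustrative rank-k CUR via a Gaussian sketch and column-pivoted QR.

    Hypothetical helper for exposition only; names and defaults are assumptions.
    Returns (cols, rows, C, U, R) with A approximately equal to C @ U @ R.
    """
    rng = np.random.default_rng(seed)
    m, n = A.shape

    # Dimension reduction: compress the m rows of A to k + oversample rows.
    G = rng.standard_normal((k + oversample, m))
    Y = G @ A  # shape (k + oversample, n)

    # Column-pivoted QR on the small sketch selects k column indices.
    _, _, col_piv = qr(Y, mode="economic", pivoting=True)
    cols = col_piv[:k]
    C = A[:, cols]  # shape (m, k)

    # Column-pivoted QR on C^T selects k row indices.
    _, _, row_piv = qr(C.T, mode="economic", pivoting=True)
    rows = row_piv[:k]
    R = A[rows, :]  # shape (k, n)

    # Linking matrix U = pinv(C) @ A @ pinv(R), one common choice among several.
    U = np.linalg.pinv(C) @ A @ np.linalg.pinv(R)
    return cols, rows, C, U, R
```

Calling `randomized_cur(A, k)` on a dense array `A` returns the selected index sets together with the three factors. For a sparse or very large matrix, forming U through dense pseudoinverses as done here would likely be too expensive, so this sketch is best read as a statement of the idea rather than a scalable implementation.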
 Publication:

arXiv e-prints
 Pub Date:
 April 2021
 DOI:
 10.48550/arXiv.2104.05877
 arXiv:
 arXiv:2104.05877
 Bibcode:
 2021arXiv210405877D
 Keywords:

 Mathematics - Numerical Analysis
 EPrint:
 21 pages, 7 figures