Is Input Sparsity Time Possible for Kernel Low-Rank Approximation?
Abstract
Low-rank approximation is a common tool used to accelerate kernel methods: the $n \times n$ kernel matrix $K$ is approximated via a rank-$k$ matrix $\tilde K$ which can be stored in much less space and processed more quickly. In this work we study the limits of computationally efficient low-rank kernel approximation. We show that for a broad class of kernels, including the popular Gaussian and polynomial kernels, computing a relative error $k$-rank approximation to $K$ is at least as difficult as multiplying the input data matrix $A \in \mathbb{R}^{n \times d}$ by an arbitrary matrix $C \in \mathbb{R}^{d \times k}$. Barring a breakthrough in fast matrix multiplication, when $k$ is not too large, this requires $\Omega(nnz(A)k)$ time, where $nnz(A)$ is the number of non-zeros in $A$. This lower bound matches, in many parameter regimes, recent work on subquadratic time algorithms for low-rank approximation of general kernels [MM16,MW17], demonstrating that these algorithms are unlikely to be significantly improved, in particular to $O(nnz(A))$ input sparsity runtimes. At the same time there is hope: we show for the first time that $O(nnz(A))$ time approximation is possible for general radial basis function kernels (e.g., the Gaussian kernel) for the closely related problem of low-rank approximation of the kernelized dataset.
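To make the $nnz(A)k$ quantity concrete, the following minimal sketch multiplies a sparse $A$ by a dense $C$ in the straightforward way and counts multiply-adds. It is an illustration only, not code from the paper: the function name `sparse_times_dense`, the dict-per-row storage, and the small example matrices are all assumptions chosen for readability. The sketch shows the cost of the naive algorithm; the abstract's claim is the conditional lower bound, namely that doing substantially better would require a breakthrough in fast matrix multiplication.

```python
def sparse_times_dense(rows, C):
    """Multiply a sparse matrix A by a dense d x k matrix C.

    A is given as a list of {column_index: value} dicts, one per row
    (an assumed storage format for this sketch). C is a list of d lists
    of length k. Returns (A @ C as a list of lists, multiply-add count).
    """
    k = len(C[0])
    out = []
    ops = 0
    for row in rows:
        acc = [0.0] * k
        # Each nonzero A[i][j] touches one full row of C: k multiply-adds.
        for j, a_ij in row.items():
            c_row = C[j]
            for t in range(k):
                acc[t] += a_ij * c_row[t]
            ops += k
        out.append(acc)
    return out, ops


# Hypothetical 3 x 4 data matrix with 4 nonzeros, and a 4 x 2 matrix C.
A = [{0: 2.0}, {1: 1.0, 3: -1.0}, {2: 0.5}]
C = [[1.0, 0.0], [0.0, 1.0], [2.0, 2.0], [1.0, 1.0]]

product, ops = sparse_times_dense(A, C)
nnz = sum(len(r) for r in A)
print(ops, nnz * len(C[0]))  # both counts equal nnz(A) * k
```

The operation count depends only on the number of nonzeros in $A$ and on $k$, not on $n \cdot d$, which is why runtimes of this form are described in terms of input sparsity.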
 Publication:

arXiv e-prints
 Pub Date:
 November 2017
 arXiv:
 arXiv:1711.01596
 Bibcode:
 2017arXiv171101596M
 Keywords:

 Computer Science - Data Structures and Algorithms;
 Computer Science - Machine Learning
 E-Print:
 To appear, NIPS 2017