Generalization Guarantees for Sparse Kernel Approximation with Entropic Optimal Features
Abstract
Despite their success, kernel methods suffer from a massive computational cost in practice. In this paper, in lieu of commonly used kernel expansion with respect to $N$ inputs, we develop a novel optimal design maximizing the entropy among kernel features. This procedure results in a kernel expansion with respect to entropic optimal features (EOF), improving the data representation dramatically due to features dissimilarity. Under mild technical assumptions, our generalization bound shows that with only $O(N^{\frac{1}{4}})$ features (disregarding logarithmic factors), we can achieve the optimal statistical accuracy (i.e., $O(1/\sqrt{N})$). The salient feature of our design is its sparsity that significantly reduces the time and space cost. Our numerical experiments on benchmark datasets verify the superiority of EOF over the stateoftheart in kernel approximation.
 Publication:

arXiv eprints
 Pub Date:
 February 2020
 arXiv:
 arXiv:2002.04195
 Bibcode:
 2020arXiv200204195D
 Keywords:

 Computer Science  Machine Learning;
 Statistics  Machine Learning