Random Projections and Dimension Reduction
Abstract
This paper covers the use of randomness in two main areas: low-rank approximation and kernel methods. Low-rank approximation is central to numerical linear algebra: many applications depend on matrix decomposition algorithms that provide accurate low-rank representations of data. In modern problems, however, various factors make this hard to accomplish. One solution is the use of random projections: instead of directly computing the matrix factorization, we randomly project the matrix onto a lower-dimensional subspace and then compute the factorization, often without significant loss of accuracy. We describe how randomization can be used to create more efficient algorithms for low-rank matrix approximation, and we introduce a novel randomized algorithm for matrix decomposition. Compared to standard approaches, randomized algorithms are often faster and more robust, and they make the analysis of massive data sets tractable. Kernel methods take almost the opposite approach to low-rank approximation: the idea is to map low-dimensional data into a higher-dimensional 'feature space' in which the data is linearly separable. This enables the model to learn a nonlinear separation of the data. As before, computing the kernel matrix can be expensive for large data matrices, so we use randomized methods to approximate it. In addition, we propose an extension of the random Fourier features kernel in which hyperparameter values are randomly sampled from an interval or Borel set. The experiments discussed in this paper can be found on our website at https://rishi1999.github.io/random-projections.
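The project-then-factorize idea described in the abstract can be illustrated with a minimal sketch. It uses a standard randomized range finder followed by an SVD of the projected matrix; it is not the paper's novel algorithm. The function name `randomized_low_rank`, the `oversample` parameter, and the choice of a Gaussian test matrix are assumptions made for illustration only.

```python
import numpy as np

def randomized_low_rank(A, rank, oversample=10, seed=0):
    """Return U, s, Vt with A approximately equal to (U * s) @ Vt.

    Hypothetical helper: a generic randomized range finder plus SVD,
    assumed here for illustration, not taken from the paper.
    """
    rng = np.random.default_rng(seed)
    m, n = A.shape
    # Sample a few extra directions beyond the target rank (assumed default).
    k = min(rank + oversample, m, n)
    # Gaussian test matrix: A @ Omega is a random sample of A's column space.
    Omega = rng.standard_normal((n, k))
    # Orthonormal basis Q for the sampled subspace.
    Q, _ = np.linalg.qr(A @ Omega)
    # Project A onto that subspace; B is only k x n, so its SVD is cheap.
    B = Q.T @ A
    U_small, s, Vt = np.linalg.svd(B, full_matrices=False)
    # Lift the left singular vectors back to the original space and truncate.
    U = Q @ U_small
    return U[:, :rank], s[:rank], Vt[:rank]

# Usage on a synthetic matrix whose exact rank is 8.
rng = np.random.default_rng(1)
A = rng.standard_normal((500, 8)) @ rng.standard_normal((8, 300))
U, s, Vt = randomized_low_rank(A, rank=8)
rel_err = np.linalg.norm(A - (U * s) @ Vt) / np.linalg.norm(A)
```

The expensive factorization is applied only to the small projected matrix `B`, which is where the savings come from when the original matrix is large. On a matrix that is not exactly low-rank, the error would depend on how quickly its singular values decay.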
- Publication: arXiv e-prints
- Pub Date: August 2020
- DOI: 10.48550/arXiv.2008.04552
- arXiv: arXiv:2008.04552
- Bibcode: 2020arXiv200804552A
- Keywords: Mathematics - Numerical Analysis; 65F55 (Primary) 62H30 (Secondary); G.1.3
- E-Print: 30 pages