Using Dimensionality Reduction to Optimize t-SNE

doi:10.48550/arXiv.1912.01098

t-SNE is a popular tool for embedding multi-dimensional datasets into two or three dimensions. However, it has a large computational cost, especially when the input data has many dimensions. Many practitioners therefore apply t-SNE to the output of a neural network, which is generally of much lower dimension than the original data. Relying on a trained network in this way limits the use of t-SNE in unsupervised scenarios. We propose using random projections to embed high-dimensional datasets into relatively few dimensions, and then using t-SNE to obtain a two-dimensional embedding. We show that random projections preserve the desirable clustering achieved by t-SNE, while dramatically reducing the runtime of finding the embedding.
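The first stage of the pipeline described above, a random projection into relatively few dimensions, can be sketched in a few lines. This is only an illustration under stated assumptions: the abstract does not say which random-projection construction is used, so the dense Gaussian matrix, the helper names `gaussian_random_projection` and `euclidean`, the toy data, and the target dimension of 50 are all hypothetical choices.

```python
import math
import random


def gaussian_random_projection(points, target_dim, seed=0):
    """Project each row of `points` (n x d) down to `target_dim` dimensions.

    Assumption: a dense Gaussian matrix with entries drawn from
    N(0, 1/target_dim). This is one common construction, not necessarily
    the one used in the paper.
    """
    rng = random.Random(seed)
    d = len(points[0])
    scale = 1.0 / math.sqrt(target_dim)
    # Projection matrix with shape d x target_dim.
    matrix = [
        [rng.gauss(0.0, 1.0) * scale for _ in range(target_dim)]
        for _ in range(d)
    ]
    return [
        [sum(row[i] * matrix[i][j] for i in range(d)) for j in range(target_dim)]
        for row in points
    ]


def euclidean(a, b):
    """Euclidean distance between two equal-length vectors."""
    return math.sqrt(sum((x - y) ** 2 for x, y in zip(a, b)))


# Toy data (hypothetical): 20 points in 500 dimensions.
data_rng = random.Random(1)
data = [[data_rng.gauss(0.0, 1.0) for _ in range(500)] for _ in range(20)]

projected = gaussian_random_projection(data, target_dim=50)

# Ratio of a pairwise distance after and before projection; values near 1
# indicate that the projection roughly preserved that distance.
ratio = euclidean(projected[0], projected[1]) / euclidean(data[0], data[1])
print(len(projected), len(projected[0]), round(ratio, 2))
```

In a full pipeline, `projected` would then be passed to a t-SNE implementation (for example scikit-learn's `sklearn.manifold.TSNE`) in place of the original high-dimensional data, so that t-SNE's cost depends on the reduced dimension rather than the original one.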

Publication: arXiv e-prints

Pub Date: December 2019

DOI: 10.48550/arXiv.1912.01098

arXiv: arXiv:1912.01098

Bibcode: 2019arXiv191201098S

Keywords: Computer Science - Machine Learning; Statistics - Machine Learning

E-Print: 11th Annual Workshop on Optimization for Machine Learning (OPT2019)
