A Local Similarity-Preserving Framework for Nonlinear Dimensionality Reduction with Neural Networks
Abstract
Real-world data usually have high dimensionality, and it is important to mitigate the curse of dimensionality. High-dimensional data typically lie on a coherent structure, so the data have relatively few true degrees of freedom. Both global and local dimensionality reduction methods exist to alleviate this problem. Most existing local dimensionality reduction methods obtain an embedding through eigenvalue or singular value decomposition, whose computational complexity is very high for large amounts of data. Here we propose a novel local nonlinear approach named Vec2vec for general-purpose dimensionality reduction, which generalizes recent advances in word embedding representation learning to dimensionality reduction of matrices. It obtains the nonlinear embedding using a neural network with only one hidden layer to reduce the computational complexity. To train the neural network, we build the neighborhood similarity graph of a matrix and define the context of data points by exploiting random walk properties. Experiments demonstrate that Vec2vec is more efficient than several state-of-the-art local dimensionality reduction methods on large numbers of high-dimensional data. Extensive experiments on data classification and clustering over eight real datasets show that Vec2vec outperforms several classical dimensionality reduction methods under statistical hypothesis tests, and it is competitive with the recently developed state-of-the-art UMAP.
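The pipeline the abstract describes (neighborhood similarity graph, random-walk contexts, a one-hidden-layer network) can be sketched as follows. This is a minimal illustration, not the paper's implementation: it assumes a cosine-similarity kNN graph, uniform random walks over neighbors, and a skip-gram-style objective with one negative sample per context pair; all function names and hyperparameters are hypothetical.

```python
import numpy as np

rng = np.random.default_rng(0)

def knn_graph(X, k=5):
    # Neighborhood similarity graph: each row of the result holds the
    # indices of the k most cosine-similar points (self excluded).
    Xn = X / np.linalg.norm(X, axis=1, keepdims=True)
    S = Xn @ Xn.T
    np.fill_diagonal(S, -np.inf)
    return np.argsort(-S, axis=1)[:, :k]

def random_walks(nbrs, num_walks=10, walk_len=8):
    # Define the "context" of each point via uniform random walks
    # on the kNN graph, analogous to sentences in word embedding.
    n = nbrs.shape[0]
    walks = []
    for _ in range(num_walks):
        for start in range(n):
            walk = [start]
            for _ in range(walk_len - 1):
                walk.append(int(rng.choice(nbrs[walk[-1]])))
            walks.append(walk)
    return walks

def skipgram_embed(walks, n, dim=2, window=2, lr=0.05, epochs=3):
    # One-hidden-layer network trained with negative sampling;
    # the input-to-hidden weights W become the low-dim embedding.
    W = rng.normal(scale=0.1, size=(n, dim))  # target vectors (embedding)
    C = rng.normal(scale=0.1, size=(n, dim))  # context vectors
    for _ in range(epochs):
        for walk in walks:
            for i, u in enumerate(walk):
                lo, hi = max(0, i - window), min(len(walk), i + window + 1)
                for j in range(lo, hi):
                    if j == i:
                        continue
                    # one positive pair and one random negative pair
                    for ctx, label in ((walk[j], 1.0), (int(rng.integers(n)), 0.0)):
                        score = 1.0 / (1.0 + np.exp(-(W[u] @ C[ctx])))
                        g = lr * (label - score)
                        grad_u = g * C[ctx]
                        C[ctx] += g * W[u]
                        W[u] += grad_u
    return W
```

A typical usage would be `emb = skipgram_embed(random_walks(knn_graph(X)), len(X), dim=2)`, yielding one low-dimensional vector per row of `X` whose local neighborhoods mirror those in the original space.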
 Publication:

arXiv e-prints
 Pub Date:
 March 2021
 DOI:
 10.48550/arXiv.2103.06383
 arXiv:
 arXiv:2103.06383
 Bibcode:
 2021arXiv210306383W
 Keywords:

 Computer Science - Machine Learning
 E-Print:
 Vec2vec, DASFAA 2021, 16 pages