Grassmannian Diffusion Maps-Based Dimension Reduction and Classification for High-Dimensional Data
Abstract
This work introduces the Grassmannian Diffusion Maps, a novel nonlinear dimensionality reduction technique that defines the affinity between points through their representation as low-dimensional subspaces corresponding to points on the Grassmann manifold. The method is designed for applications, such as image recognition and data-based classification of high-dimensional data that can be compactly represented in a lower dimensional subspace. The GDMaps is composed of two stages. The first is a pointwise linear dimensionality reduction wherein each high-dimensional object is mapped onto the Grassmann. The second stage is a multi-point nonlinear kernel-based dimension reduction using Diffusion maps to identify the subspace structure of the points on the Grassmann manifold. To this aim, an appropriate Grassmannian kernel is used to construct the transition matrix of a random walk on a graph connecting points on the Grassmann manifold. Spectral analysis of the transition matrix yields low-dimensional Grassmannian diffusion coordinates embedding the data into a low-dimensional reproducing kernel Hilbert space. Further, a novel data classification/recognition technique is developed based on the construction of an overcomplete dictionary of reduced dimension whose atoms are given by the Grassmannian diffusion coordinates. Three examples are considered. First, a "toy" example shows that the GDMaps can identify an appropriate parametrization of structured points on the unit sphere. The second example demonstrates the ability of the GDMaps to reveal the intrinsic subspace structure of high-dimensional random field data. In the last example, a face recognition problem is solved considering face images subject to varying illumination conditions, changes in face expressions, and occurrence of occlusions.
- Publication:
-
SIAM Journal on Scientific Computing
- Pub Date:
- April 2022
- DOI:
- 10.1137/20M137001X
- arXiv:
- arXiv:2009.07547
- Bibcode:
- 2022SJSC...44B.250D
- Keywords:
-
- Computer Science - Machine Learning;
- Statistics - Machine Learning;
- 53Z50;
- 14M15;
- 60J20