Improving Metric Dimensionality Reduction with Distributed Topology
Abstract
We propose a novel approach to dimensionality reduction that combines techniques from metric geometry and distributed persistent homology in a gradient-descent-based method called DIPOLE. DIPOLE is a dimensionality-reduction post-processing step that corrects an initial embedding by minimizing a loss functional with both a local metric term and a global topological term. By fixing an initial embedding method (we use Isomap), DIPOLE can also be viewed as a full dimensionality-reduction pipeline. This framework is based on the strong theoretical and computational properties of distributed persistent homology and comes with the guarantee of almost sure convergence. We observe that DIPOLE outperforms popular methods like UMAP, t-SNE, and Isomap on a number of widely used datasets, both visually and in terms of precise quantitative metrics.
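To make the post-processing idea concrete, the following is a minimal sketch of gradient descent on a local metric term alone. It is not the authors' implementation: the global topological term based on distributed persistent homology is omitted entirely, and the specific loss (a squared-error stress over k-nearest-neighbor pairs), the function names `knn_pairs`, `local_metric_loss_and_grad`, and `refine_embedding`, and the parameters `lr`, `steps`, and `k` are all assumptions made for illustration.

```python
import numpy as np

def knn_pairs(X, k):
    """Return (i, j) index pairs for each point's k nearest neighbors in X,
    together with the corresponding input-space distances."""
    D = np.linalg.norm(X[:, None, :] - X[None, :, :], axis=-1)
    nbrs = np.argsort(D, axis=1)[:, 1:k + 1]   # column 0 is the point itself
    i = np.repeat(np.arange(len(X)), k)
    j = nbrs.ravel()
    return np.stack([i, j], axis=1), D[i, j]

def local_metric_loss_and_grad(Y, pairs, target):
    """Sum of (|Y_i - Y_j| - target_ij)^2 over the given pairs, and its
    gradient with respect to the embedding Y."""
    i, j = pairs[:, 0], pairs[:, 1]
    diff = Y[i] - Y[j]
    dist = np.linalg.norm(diff, axis=1)
    err = dist - target
    loss = float(np.sum(err ** 2))
    # d/dY_i (dist - t)^2 = 2 (dist - t) * (Y_i - Y_j) / dist
    coef = 2.0 * err / np.maximum(dist, 1e-12)
    grad = np.zeros_like(Y)
    np.add.at(grad, i, coef[:, None] * diff)
    np.add.at(grad, j, -coef[:, None] * diff)
    return loss, grad

def refine_embedding(Y0, pairs, target, lr=0.01, steps=200):
    """Correct an initial embedding Y0 by plain gradient descent on the
    local metric loss (hypothetical stand-in; no topological term)."""
    Y = Y0.copy()
    for _ in range(steps):
        _, grad = local_metric_loss_and_grad(Y, pairs, target)
        Y -= lr * grad
    return Y

if __name__ == "__main__":
    rng = np.random.default_rng(0)
    theta = np.linspace(0.0, 2.0 * np.pi, 40, endpoint=False)
    # Noisy circle in 3D; a shrunken projection plays the role of the initial embedding.
    X = np.stack([np.cos(theta), np.sin(theta), 0.1 * rng.standard_normal(40)], axis=1)
    pairs, target = knn_pairs(X, k=4)
    Y0 = 0.5 * X[:, :2]
    Y = refine_embedding(Y0, pairs, target)
    print("loss before:", local_metric_loss_and_grad(Y0, pairs, target)[0])
    print("loss after: ", local_metric_loss_and_grad(Y, pairs, target)[0])
```

In this sketch the initial embedding is supplied by the caller, which mirrors the abstract's description of DIPOLE as a correction applied to an existing embedding such as one produced by Isomap. A topological term would enter as an additional summand in the loss, with its own gradient added before the update step.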
- Publication: arXiv e-prints
- Pub Date: June 2021
- DOI: 10.48550/arXiv.2106.07613
- arXiv: arXiv:2106.07613
- Bibcode: 2021arXiv210607613W
- Keywords: Computer Science - Machine Learning; Mathematics - Algebraic Topology
- E-Print: fixed bug in code, replaced affected figures, minor improvements observed