Intrinsic persistent homology via density-based metric learning
Abstract
We address the problem of estimating topological features from data in high-dimensional Euclidean spaces under the manifold assumption. Our approach is based on computing the persistent homology of the space of data points endowed with a sample metric known as Fermat distance. We prove that this metric space converges almost surely to the manifold itself endowed with an intrinsic metric that accounts for both the geometry of the manifold and the density that produces the sample. This fact implies the convergence of the associated persistence diagrams. Using this intrinsic distance when computing persistent homology offers advantages such as robustness to outliers in the input data and lower sensitivity to the particular embedding of the underlying manifold in the ambient space. We use these ideas to propose and implement a method for pattern recognition and anomaly detection in time series, which we evaluate in applications to real data.
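To make the sample metric concrete, the following is a minimal sketch of the sample Fermat distance on a finite point cloud, assuming its usual definition: the cheapest path through the sample, where each hop between two sample points costs their Euclidean distance raised to a power p ≥ 1. The function name `fermat_distance_matrix`, the default `p=2.0`, and the use of Floyd-Warshall on the complete graph are illustrative assumptions, not the authors' implementation.

```python
import math


def fermat_distance_matrix(points, p=2.0):
    """All-pairs sample Fermat distance (illustrative sketch, O(n^3)).

    Assumption: a hop between sample points x and y costs |x - y|**p,
    and the Fermat distance is the minimum total cost over all paths
    that travel through sample points only.
    """
    n = len(points)
    # Direct-hop costs: Euclidean distance raised to the power p.
    d = [[math.dist(points[i], points[j]) ** p for j in range(n)]
         for i in range(n)]
    # Floyd-Warshall relaxation through every intermediate point k.
    for k in range(n):
        for i in range(n):
            dik = d[i][k]
            for j in range(n):
                if dik + d[k][j] < d[i][j]:
                    d[i][j] = dik + d[k][j]
    return d


# Three collinear points: with p = 2, two short hops (1 + 1) are cheaper
# than one long hop (2**2 = 4), so paths prefer densely sampled regions.
pts = [(0.0, 0.0), (1.0, 0.0), (2.0, 0.0)]
D = fermat_distance_matrix(pts, p=2.0)
```

With p = 1 the triangle inequality makes the direct hop optimal, so the matrix reduces to plain Euclidean distances; larger p penalizes long hops across sparsely sampled regions. The resulting matrix can be handed to any Vietoris-Rips persistent homology routine that accepts a precomputed distance matrix.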
- Publication:
- arXiv e-prints
- Pub Date:
- December 2020
- DOI:
- 10.48550/arXiv.2012.07621
- arXiv:
- arXiv:2012.07621
- Bibcode:
- 2020arXiv201207621F
- Keywords:
- Statistics - Machine Learning;
- Computer Science - Machine Learning;
- Mathematics - Algebraic Topology;
- Mathematics - Probability;
- 62G05;
- 62G20;
- 62-07;
- 57N16;
- 57N25;
- 55U10
- E-Print:
- 37 pages. v3: major revision. Final version accepted for publication at Journal of Machine Learning Research