Approximate Nearest Neighbors in the Space of Persistence Diagrams
Abstract
Persistence diagrams are important tools in the field of topological data analysis that describe the presence and magnitude of features in a filtered topological space. However, current approaches for comparing a persistence diagram to a set of other persistence diagrams are linear in the number of diagrams or do not offer performance guarantees. In this paper, we apply concepts from locality-sensitive hashing to support approximate nearest neighbor search in the space of persistence diagrams. Given a set $\Gamma$ of $n$ $(M,m)$-bounded persistence diagrams, each with at most $m$ points, we snap-round the points of each diagram to points on a cubical lattice and produce a key for each possible snap-rounding. Specifically, we fix a grid over each diagram at several resolutions and consider the snap-roundings of each diagram to the four nearest lattice points. Then, we propose a data structure $\mathbb{D}_{\tau}$ with $\tau$ levels that stores all snap-roundings of each persistence diagram in $\Gamma$ at each resolution. This data structure has size $O(n5^m\tau)$ to account for varying lattice resolutions as well as snap-roundings and the deletion of points with low persistence. To search for a persistence diagram, we compute a key for a query diagram by snapping each point to a lattice and deleting points of low persistence. Furthermore, as the lattice parameter decreases, searching our data structure yields a six-approximation of the nearest diagram in $\Gamma$ in $O((m\log{n}+m^2)\log\tau)$ time and a constant-factor approximation of the $k$th nearest diagram in $O((m\log{n}+m^2+k)\log\tau)$ time.
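The key construction described in the abstract can be illustrated with a minimal Python sketch for a single lattice resolution. This is not the paper's implementation: the function names, the `min_persistence` threshold, and the choice to represent a key as a sorted tuple of integer lattice indices are all assumptions made for illustration. Each point contributes its four surrounding lattice corners, plus a "deleted" option when its persistence is low, which gives at most $5^m$ keys per diagram per level.

```python
from itertools import product
from math import floor


def corner_choices(point, delta):
    """Four lattice corners (integer index pairs) of the grid cell containing `point`."""
    b, d = point
    i, j = floor(b / delta), floor(d / delta)
    return [(i, j), (i + 1, j), (i, j + 1), (i + 1, j + 1)]


def snap_rounding_keys(diagram, delta, min_persistence):
    """All keys for one diagram at lattice width `delta` (hypothetical scheme).

    A point with persistence below `min_persistence` (an assumed threshold)
    may additionally be deleted, so each point has at most five options.
    """
    options = []
    for b, d in diagram:
        choices = corner_choices((b, d), delta)
        if d - b < min_persistence:
            choices = choices + [None]  # None marks "point deleted"
        options.append(choices)
    keys = set()
    for combo in product(*options):
        # Sort so the key does not depend on the order of points in the diagram.
        keys.add(tuple(sorted(c for c in combo if c is not None)))
    return keys


def query_key(diagram, delta, min_persistence):
    """Single key for a query: drop low-persistence points, snap the rest
    to the nearest lattice point."""
    kept = [(b, d) for b, d in diagram if d - b >= min_persistence]
    return tuple(sorted(
        (floor(b / delta + 0.5), floor(d / delta + 0.5)) for b, d in kept
    ))


# Example: two points, one of which has low persistence.
diagram = [(0.2, 1.7), (0.4, 0.5)]
keys = snap_rounding_keys(diagram, delta=1.0, min_persistence=0.5)
q = query_key(diagram, delta=1.0, min_persistence=0.5)
```

In this sketch, a lookup at one level would amount to testing whether `q` appears among the stored keys, for example in a dictionary mapping keys to diagram identifiers. Repeating the construction at $\tau$ lattice widths would give the multi-level structure the abstract describes.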
 Publication:

arXiv e-prints
 Pub Date:
 December 2018
 arXiv:
 arXiv:1812.11257
 Bibcode:
 2018arXiv181211257F
 Keywords:

 Computer Science - Computational Geometry