The $\ell^\infty$-Cophenetic Metric for Phylogenetic Trees as an Interleaving Distance
Abstract
Comparing phylogenetic trees is a fundamental task in computational biology, and many metrics are available for it. In this paper, we focus on one such metric, the $\ell^\infty$-cophenetic metric introduced by Cardona et al. This metric represents a phylogenetic tree with $n$ labeled leaves as a point in $\mathbb{R}^{n(n+1)/2}$, known as the cophenetic vector, and then compares the two resulting points using the $\ell^\infty$ distance. Meanwhile, the interleaving distance is a formal categorical construction generalized from the definition of Chazal et al., which was originally introduced to compare persistence modules arising in topological data analysis. We show that the $\ell^\infty$-cophenetic metric is an example of an interleaving distance. To do this, we define phylogenetic trees as a category of merge trees with some additional structure: labelings on the leaves, plus a requirement that morphisms respect these labels. We can then use the definition of a flow on this category to give an interleaving distance. Finally, we show that, because of the additional structure given by the categories defined, the map sending a labeled merge tree to its cophenetic vector is an isometric embedding, thus proving that the $\ell^\infty$-cophenetic metric is an interleaving distance.
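As a rough illustration of the metric described in the abstract, the following Python sketch builds a cophenetic vector and compares two trees with the $\ell^\infty$ distance. It rests on assumptions not stated in the abstract: the entry for a pair of leaves $(i, j)$ is taken to be the depth of their lowest common ancestor (and, for $i = j$, the depth of the leaf itself), every edge has length 1, and a tree is stored as a hypothetical child-to-parent dictionary. The function names are illustrative only and do not come from the paper.

```python
from itertools import combinations_with_replacement


def ancestors(node, parent):
    """Return the path from `node` up to the root, starting with `node` itself."""
    path = []
    while node is not None:
        path.append(node)
        node = parent[node]
    return path


def cophenetic_vector(parent, leaves):
    """Cophenetic vector of a rooted tree given as a child -> parent dict.

    Assumption: unit edge lengths, so the depth of a node is the number of
    edges between it and the root. The vector has n(n+1)/2 entries, one per
    unordered pair of leaves (including each leaf paired with itself).
    """
    depth = {node: float(len(ancestors(node, parent)) - 1) for node in parent}
    vector = []
    for a, b in combinations_with_replacement(sorted(leaves), 2):
        above_a = set(ancestors(a, parent))
        # The first ancestor of b that also lies above a is their lowest
        # common ancestor; when a == b this is the leaf itself.
        lca = next(x for x in ancestors(b, parent) if x in above_a)
        vector.append(depth[lca])
    return vector


def linf_cophenetic_distance(parent1, parent2, leaves):
    """Largest absolute coordinate difference between the two cophenetic vectors."""
    v1 = cophenetic_vector(parent1, leaves)
    v2 = cophenetic_vector(parent2, leaves)
    return max(abs(x - y) for x, y in zip(v1, v2))


# Two hypothetical trees on the leaf set {a, b, c}: ((a,b),c) and (a,(b,c)).
tree1 = {"r": None, "u": "r", "a": "u", "b": "u", "c": "r"}
tree2 = {"r": None, "v": "r", "b": "v", "c": "v", "a": "r"}
leaves = ["a", "b", "c"]

print(cophenetic_vector(tree1, leaves))
print(cophenetic_vector(tree2, leaves))
print(linf_cophenetic_distance(tree1, tree2, leaves))
```

With three leaves each vector has $3 \cdot 4 / 2 = 6$ coordinates, ordered here by sorted leaf pairs; a weighted variant would replace the unit edge lengths with edge weights when computing depths.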
- Publication:
- arXiv e-prints
- Pub Date:
- February 2018
- DOI:
- 10.48550/arXiv.1803.07609
- arXiv:
- arXiv:1803.07609
- Bibcode:
- 2018arXiv180307609M
- Keywords:
- Computer Science - Computational Geometry;
- Mathematics - Category Theory