The asymptotic distribution of the $k$-Robinson-Foulds dissimilarity measure on labelled trees
Abstract
Motivated by applications in medical bioinformatics, Khayatian et al. (2024) introduced a family of metrics on Cayley trees (the $k$-RF distance, for $k=0, \ldots, n-2$) and explored their distribution on pairs of random Cayley trees via simulations. In this paper, we investigate this distribution mathematically and derive exact asymptotic descriptions of the distribution of the $k$-RF metric for the extreme values $k=0$ and $k=n-2$, as $n$ becomes large. We show that a linear transform of the $0$-RF metric converges to a Poisson distribution (with mean 2), whereas a similar transform of the $(n-2)$-RF metric leads to a normal distribution (with mean $\sim ne^{-2}$). These results, together with the cases $k=1$ (which behaves quite differently) and $k=n-3$, shed light on the earlier simulation results and on the predictions made concerning them.
- Publication: arXiv e-prints
- Pub Date: December 2024
- DOI:
- arXiv: arXiv:2412.20012
- Bibcode: 2024arXiv241220012F
- Keywords: Mathematics - Probability; Mathematics - Combinatorics; Quantitative Biology - Populations and Evolution; 05A16 05C05
- E-Print: 16 pages, 2 figures