Tree split probabilities determine the branch lengths
Abstract
The evolution of aligned DNA sequence sites is generally modeled by a Markov process operating along the edges of a phylogenetic tree. It is well known that the probability distribution on the site patterns at the tips of the tree determines the tree and its branch lengths. However, the number of patterns is typically much larger than the number of edges, suggesting considerable redundancy in the branch length estimation. In this paper we ask whether the probabilities of just the 'edge-specific' patterns (the ones that correspond to a change of state on a single edge) suffice to recover the branch lengths of the tree, under a symmetric 2-state Markov process. We first show that this holds provided the branch lengths are sufficiently short, by applying the inverse function theorem. We then consider whether this restriction to short branch lengths is necessary, and show that for trees with up to four leaves it can be lifted. This leaves open the interesting question of whether this holds in general.
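As a rough illustration of the question posed above, the following sketch treats only the smallest case, a 3-leaf star tree, where the three edge-specific patterns happen to be all of the non-constant patterns. It is not the paper's argument. It assumes the common parametrization of the symmetric 2-state model in which an edge of length t has change probability p = (1 - e^{-2t})/2, and the function names (`change_prob`, `split_probs_star3`, `recover_lengths_star3`) are hypothetical, chosen for this example only.

```python
import math

def change_prob(t):
    # Assumed parametrization: probability that the two ends of an edge
    # of length t differ under the symmetric 2-state model.
    return 0.5 * (1.0 - math.exp(-2.0 * t))

def split_probs_star3(lengths):
    # Probability of each split {i} | {j, k} on a 3-leaf star tree.
    # Leaf i is separated from the other two if either only edge i
    # changes state, or both of the other edges change state.
    p = [change_prob(t) for t in lengths]
    probs = []
    for i in range(3):
        j, k = [m for m in range(3) if m != i]
        probs.append(p[i] * (1 - p[j]) * (1 - p[k]) + (1 - p[i]) * p[j] * p[k])
    return probs

def recover_lengths_star3(s):
    # Leaves i and j differ exactly when the split is {i}|rest or {j}|rest,
    # so theta_i * theta_j = 1 - 2 * (s_i + s_j), where theta = 1 - 2p = e^{-2t}.
    a01 = 1 - 2 * (s[0] + s[1])
    a02 = 1 - 2 * (s[0] + s[2])
    a12 = 1 - 2 * (s[1] + s[2])
    theta = [
        math.sqrt(a01 * a02 / a12),
        math.sqrt(a01 * a12 / a02),
        math.sqrt(a02 * a12 / a01),
    ]
    return [-0.5 * math.log(x) for x in theta]

true_lengths = [0.1, 0.2, 0.3]
split_probs = split_probs_star3(true_lengths)
recovered = recover_lengths_star3(split_probs)
```

Here `recovered` should agree with `true_lengths` up to floating-point error. For larger trees the edge-specific patterns are a strict subset of all patterns, so a closed-form inversion like this is not available in general, which is the case the abstract is concerned with.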
- Publication: arXiv e-prints
- Pub Date: October 2013
- DOI: 10.48550/arXiv.1310.3316
- arXiv: arXiv:1310.3316
- Bibcode: 2013arXiv1310.3316C
- Keywords: Quantitative Biology - Populations and Evolution
- E-Print: 12 pages, 1 figure