Phylogenetic information complexity: Is testing a tree easier than finding it?
Abstract
Phylogenetic trees describe the evolutionary history of a group of present-day species from a common ancestor. These trees are typically reconstructed from aligned DNA sequence data. In this paper we analytically address the following question: Is the amount of sequence data required to accurately reconstruct a tree significantly more than the amount required to test whether or not a candidate tree is the 'true' tree? By 'significantly', we mean that the two quantities do not behave the same way as a function of the number of species being considered. We prove that, for a certain type of model, the amount of information required is not significantly different; for another type of model, the information required to test a tree is independent of the number of leaves, whereas the information required to reconstruct it grows with this number. Our results combine probabilistic and combinatorial arguments.
- Publication:
- Journal of Theoretical Biology
- Pub Date:
- 2009
- DOI:
- 10.1016/j.jtbi.2009.01.007
- arXiv:
- arXiv:0807.1756
- Bibcode:
- 2009JThBi.258...95S
- Keywords:
- Phylogenetic tree;
- Information content;
- Sequence length;
- Reconstruction;
- Quantitative Biology - Populations and Evolution;
- Quantitative Biology - Quantitative Methods
- E-Print:
- 15 pages, 3 figures