On the information content of discrete phylogenetic characters
Abstract
Phylogenetic inference aims to reconstruct the evolutionary relationships of different species based on genetic (or other) data. Discrete characters are a particular type of data that contain information on how the species should be grouped together. However, it has long been known that some characters contain more information than others. For instance, a character that assigns the same state to each species groups all of them together and so provides no insight into the relationships of the species considered. At the other extreme, a character that assigns a different state to each species also conveys no phylogenetic signal. In this manuscript, we study a natural combinatorial measure of the information content of an individual character and analyse properties of characters that provide the maximum phylogenetic information, in particular the number of states such a character uses and how the different states have to be distributed among the species or taxa of the phylogenetic tree.
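The abstract does not define the measure. As a rough illustration only, the sketch below assumes one standard combinatorial choice: the information content of a character is the negative logarithm of the fraction of unrooted binary trees on which the character is convex (homoplasy-free). It is restricted to two-state characters, where that fraction has a simple closed form. The function names are hypothetical and not taken from the paper.

```python
from math import log


def odd_double_factorial(k: int) -> int:
    """Return k!! for odd k, using the convention (-1)!! = 1!! = 1."""
    result = 1
    while k > 1:
        result *= k
        k -= 2
    return result


def rooted_binary_trees(m: int) -> int:
    """Number of rooted binary trees on m >= 1 labelled leaves: (2m-3)!!."""
    return odd_double_factorial(2 * m - 3)


def unrooted_binary_trees(n: int) -> int:
    """Number of unrooted binary trees on n >= 3 labelled leaves: (2n-5)!!."""
    return odd_double_factorial(2 * n - 5)


def two_state_information(a: int, b: int) -> float:
    """Assumed measure: -ln(fraction of binary trees on which the character is convex).

    The character assigns one state to a taxa and the other state to b taxa
    (a + b >= 3). A tree is convex for it exactly when it contains the split
    separating the two groups, i.e. a rooted tree on each side of one edge.
    """
    if a == 0 or b == 0:
        return 0.0  # constant character: convex on every tree
    convex = rooted_binary_trees(a) * rooted_binary_trees(b)
    return -log(convex / unrooted_binary_trees(a + b))


if __name__ == "__main__":
    # All ways to split 6 taxa between two states, from constant to balanced.
    for a in range(0, 4):
        print(a, 6 - a, round(two_state_information(a, 6 - a), 3))
```

Under this assumption, a constant character and a character whose second state occurs on a single taxon both score zero, while the balanced 3|3 split scores highest among two-state characters on six taxa. The other extreme mentioned in the abstract, a distinct state for every taxon, is not covered by this two-state sketch.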
- Publication: arXiv e-prints
- Pub Date: March 2017
- DOI: 10.48550/arXiv.1703.04734
- arXiv: arXiv:1703.04734
- Bibcode: 2017arXiv170304734B
- Keywords: Quantitative Biology - Populations and Evolution
- E-Print: 16 pages, 7 figures. Final version has now appeared in Journal of Mathematical Biology (December 2017) https://link.springer.com/article/10.1007/s00285-017-1198-2