On the uniqueness of the maximum parsimony tree for data with few substitutions within the NNI neighborhood
Abstract
Estimating species relationship trees, so-called phylogenetic trees, from aligned sequence data (such as DNA, RNA, or proteins) is one of the main aims of evolutionary biology. However, tree reconstruction criteria like maximum parsimony do not necessarily lead to unique trees, and in some cases they even fail to recognize the "correct" tree (i.e., the tree on which the data was generated). On the other hand, a recent study has shown that for an alignment containing precisely those characters (sites) which require up to two substitutions on a given tree, this tree will be the unique maximum parsimony tree. The aim of the present manuscript is to generalize this result in the following sense: We show that for a tree $T$ with $n$ leaves, as long as $k < \frac{n}{8}+\frac{6}{5}-\frac{1}{10}\sqrt{\frac{5}{4} n+4}$ (or, equivalently, $n > 8k-\frac{46}{5}+\frac{2}{5}\sqrt{40k-31}$), the maximum parsimony tree for the alignment containing all characters which require (up to or precisely) $k$ substitutions on $T$ is unique within the NNI neighborhood of $T$ and coincides with $T$. In other words, within the NNI neighborhood of $T$, $T$ is the unique most parsimonious tree for said alignment. This partially answers a recently published conjecture affirmatively.
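To get a feel for the size of the bound, the following minimal sketch evaluates the form of the condition that is solved for $n$, namely $n > 8k-\frac{46}{5}+\frac{2}{5}\sqrt{40k-31}$. The function names `satisfies_bound` and `min_leaves` are hypothetical and not taken from the manuscript. To avoid floating-point rounding at the boundary, the sketch multiplies the inequality by 5 and squares it, so that only integer arithmetic is needed; this assumes $k \geq 1$.

```python
def satisfies_bound(n: int, k: int) -> bool:
    """Check n > 8k - 46/5 + (2/5) * sqrt(40k - 31) using exact integers.

    Multiplying by 5 gives 5n - 40k + 46 > 2 * sqrt(40k - 31).
    For k >= 1 the right-hand side is positive, so the inequality holds
    exactly when the left-hand side is positive and its square exceeds
    4 * (40k - 31).
    """
    if k < 1:
        raise ValueError("this sketch assumes k >= 1")
    lhs = 5 * n - 40 * k + 46
    return lhs > 0 and lhs * lhs > 4 * (40 * k - 31)


def min_leaves(k: int) -> int:
    """Smallest number of leaves n that satisfies the bound for a given k."""
    n = 1
    # The condition is monotone in n, so a linear scan finds the minimum.
    while not satisfies_bound(n, k):
        n += 1
    return n


if __name__ == "__main__":
    for k in range(1, 7):
        print(k, min_leaves(k))
```

For example, `min_leaves(2)` returns 10, because the right-hand side evaluates to exactly 9.6 for $k = 2$. Note that the sketch only evaluates the inequality itself; any further requirements that the manuscript places on $n$ or $k$ are not modeled here.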
- Publication: arXiv e-prints
- Pub Date: March 2024
- DOI: 10.48550/arXiv.2403.01282
- arXiv: arXiv:2403.01282
- Bibcode: 2024arXiv240301282F
- Keywords: Quantitative Biology - Populations and Evolution; Mathematics - Combinatorics; 05C05; 05C90; 92B05
- E-Print: 17 pages, 4 figures