Exact inference under the perfect phylogeny model
Abstract
Motivation: Many inference tools use the Perfect Phylogeny Model (PPM) to learn trees from noisy variant allele frequency (VAF) data. Learning in this setting is hard, and existing tools use approximate or heuristic algorithms. An algorithmic improvement is important to help disentangle the limitations of the PPM's assumptions from the limitations in our capacity to learn under it.
Results: We make such an improvement in the scenario where the mutations that are relevant for evolution can be clustered into a small number of groups and the trees to be reconstructed have a small number of nodes. We use a careful combination of algorithms, software, and hardware to develop EXACT, a tool that explores the space of all possible phylogenetic trees and performs exact inference under the PPM with noisy data. EXACT allows users to obtain not just the most likely tree for some input data, but also exact statistics about the distribution of trees that might explain the data. We show that EXACT outperforms several existing tools for this same task.
Availability: https://github.com/surjray-repos/EXACT
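To make the idea of exploring the space of all trees concrete, the following is a minimal sketch, not EXACT's algorithm. It assumes noise-free frequencies and uses the standard PPM consequence that a node's frequency must be at least the sum of its children's frequencies in every sample. EXACT itself works with noisy data, which this toy check does not handle. All function names here are hypothetical.

```python
from itertools import product


def reaches_root(v, parent, root):
    """Follow parent pointers from v; return False if a cycle is hit."""
    seen = set()
    while v != root:
        if v in seen:
            return False
        seen.add(v)
        v = parent[v]
    return True


def rooted_trees(n, root=0):
    """Yield every labeled tree on n nodes rooted at `root`,
    as a dict mapping each non-root node to its parent."""
    others = [v for v in range(n) if v != root]
    for parents in product(range(n), repeat=len(others)):
        parent = dict(zip(others, parents))
        # A parent assignment is a tree iff every node leads to the root.
        if all(reaches_root(v, parent, root) for v in others):
            yield parent


def satisfies_sum_condition(parent, freqs, tol=1e-9):
    """freqs is a list of samples; each sample lists one frequency per node.
    Check that each node's frequency covers the sum of its children's."""
    for f in freqs:
        child_sum = [0.0] * len(f)
        for child, par in parent.items():
            child_sum[par] += f[child]
        if any(child_sum[v] > f[v] + tol for v in range(len(f))):
            return False
    return True


def consistent_trees(freqs, root=0):
    """Brute-force list of all trees compatible with the given frequencies."""
    n = len(freqs[0])
    return [t for t in rooted_trees(n, root)
            if satisfies_sum_condition(t, freqs)]


if __name__ == "__main__":
    # One sample, three mutation clusters; node 0 is the root.
    print(consistent_trees([[1.0, 0.6, 0.5]]))
```

The number of labeled rooted trees grows as n^(n-2), so brute-force enumeration like this is only feasible when the number of nodes is small, which matches the scenario the abstract describes.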
- Publication: arXiv e-prints
- Pub Date: August 2019
- arXiv: arXiv:1908.08623
- Bibcode: 2019arXiv190808623R
- Keywords: Quantitative Biology - Quantitative Methods; Computer Science - Data Structures and Algorithms; Computer Science - Machine Learning