Comparing Nonparametric Bayesian Tree Priors for Clonal Reconstruction of Tumors
Abstract
Statistical machine learning methods, especially nonparametric Bayesian methods, have become increasingly popular for inferring the clonal population structure of tumors. Here we describe the treeCRP, an extension of the Chinese restaurant process (CRP), a popular construction used in nonparametric mixture models, that infers the phylogeny and genotype of the major subclonal lineages represented in a population of cancer cells. We also propose new split-merge updates, tailored to the subclonal reconstruction problem, that improve the mixing time of Markov chains. In comparisons with the tree-structured stick-breaking (TSSB) prior used in PhyloSub, we demonstrate superior mixing and running time using the treeCRP with our new split-merge procedures. We also show that, given the same number of samples, TSSB and treeCRP have similar ability to recover the subclonal structure of a tumor.
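For readers unfamiliar with the construction the abstract builds on, the following is a minimal sketch of the standard (flat) Chinese restaurant process only. It is not the treeCRP and not the authors' code; the function name `sample_crp`, the concentration parameter `alpha`, and the `seed` argument are illustrative assumptions. Each item joins an existing cluster with probability proportional to that cluster's size, or opens a new cluster with probability proportional to `alpha`.

```python
import random


def sample_crp(n_customers, alpha, seed=None):
    """Draw cluster assignments from a standard CRP (illustrative sketch).

    n_customers: number of items to seat (hypothetical name).
    alpha: concentration parameter, assumed > 0.
    Returns (assignments, table_sizes).
    """
    rng = random.Random(seed)
    assignments = []   # table index chosen by each customer, in arrival order
    table_sizes = []   # current number of customers at each table

    for _ in range(n_customers):
        # Unnormalised weights: existing table k has weight size_k,
        # a new table has weight alpha.
        weights = table_sizes + [alpha]
        k = rng.choices(range(len(weights)), weights=weights)[0]
        if k == len(table_sizes):
            table_sizes.append(1)      # open a new table
        else:
            table_sizes[k] += 1        # join an existing table
        assignments.append(k)

    return assignments, table_sizes


if __name__ == "__main__":
    assignments, sizes = sample_crp(20, alpha=1.0, seed=0)
    print(assignments)
    print(sizes)
```

Larger values of `alpha` tend to produce more clusters. A tree-structured prior such as the one described here would additionally place the clusters in a phylogeny, which this flat sketch does not attempt.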
- Publication: arXiv e-prints
- Pub Date: August 2014
- arXiv: arXiv:1408.2552
- Bibcode: 2014arXiv1408.2552D
- Keywords: Quantitative Biology - Populations and Evolution; Computer Science - Machine Learning; Statistics - Machine Learning
- E-Print: Preprint of an article submitted for consideration in the Pacific Symposium on Biocomputing © 2015