The site frequency spectrum for general coalescents
Abstract
General genealogical processes such as $\Lambda$- and $\Xi$-coalescents, which respectively model multiple and simultaneous mergers, have important applications in studying marine species, strong positive selection, recurrent selective sweeps, strong bottlenecks, large sample sizes, and so on. Recently, there has been significant progress in developing useful inference tools for such general models. In particular, inference methods based on the site frequency spectrum (SFS) have received noticeable attention. Here, we derive a new formula for the expected SFS for general $\Lambda$- and $\Xi$-coalescents, which leads to an efficient algorithm. For time-homogeneous coalescents, the runtime of our algorithm for computing the expected SFS is $O(n^2)$, where $n$ is the sample size. This is a factor of $n^2$ faster than the state-of-the-art method. Furthermore, in contrast to existing methods, our method generalizes to time-inhomogeneous $\Lambda$- and $\Xi$-coalescents with measures that factorize as $\Lambda(dx)/\zeta(t)$ and $\Xi(dx)/\zeta(t)$, respectively, where $\zeta$ denotes a strictly positive function of time. The runtime of our algorithm in this setting is $O(n^3)$. We also obtain general theoretical results for the identifiability of the $\Lambda$ measure when $\zeta$ is a constant function, as well as for the identifiability of the function $\zeta$ under a fixed $\Xi$ measure.
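As background for readers new to multiple-merger models, the following minimal sketch computes the merger rates of one particular $\Lambda$-coalescent. It is not the algorithm of the paper. It assumes, purely for illustration, that $\Lambda$ is the Beta$(2-\alpha, \alpha)$ distribution with $0 < \alpha < 2$, a choice the abstract does not make. Under that assumption the rate at which a given set of $k$ out of $b$ lineages merges is $\lambda_{b,k} = \int_0^1 x^{k-2}(1-x)^{b-k}\,\Lambda(dx) = B(k-\alpha,\, b-k+\alpha)/B(2-\alpha,\, \alpha)$. The function names are hypothetical.

```python
import math


def log_beta(a, b):
    """Natural log of the Beta function B(a, b), for a, b > 0."""
    return math.lgamma(a) + math.lgamma(b) - math.lgamma(a + b)


def beta_coalescent_rate(b, k, alpha):
    """Rate at which one specific k-subset of b lineages merges.

    Assumption (not from the abstract): Lambda is Beta(2 - alpha, alpha)
    with 0 < alpha < 2, so the defining integral has the closed form
    B(k - alpha, b - k + alpha) / B(2 - alpha, alpha).
    """
    if not 2 <= k <= b:
        raise ValueError("need 2 <= k <= b")
    if not 0 < alpha < 2:
        raise ValueError("need 0 < alpha < 2")
    return math.exp(log_beta(k - alpha, b - k + alpha) - log_beta(2 - alpha, alpha))


def total_merger_rate(b, alpha):
    """Total rate of any merger among b lineages: sum over k of C(b, k) * rate."""
    return sum(math.comb(b, k) * beta_coalescent_rate(b, k, alpha)
               for k in range(2, b + 1))
```

Because $\Lambda$ is a probability measure here, `beta_coalescent_rate(2, 2, alpha)` equals 1 for every admissible `alpha`, and the rates satisfy the consistency relation $\lambda_{b,k} = \lambda_{b+1,k} + \lambda_{b+1,k+1}$ that holds for any $\Lambda$-coalescent.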
 Publication:

arXiv e-prints
 Pub Date:
 October 2015
 arXiv:
 arXiv:1510.05631
 Bibcode:
 2015arXiv151005631S
 Keywords:

 Quantitative Biology - Populations and Evolution;
 Mathematics - Probability
 E-Print:
 20 pages, 4 figures