Power laws for family sizes in a duplication model
Abstract
Qian, Luscombe and Gerstein [J. Molecular Biol. 313 (2001) 673--681] introduced a model of the diversification of protein folds in a genome that we may formulate as follows. Consider a multitype Yule process starting with one individual in which there are no deaths and each individual gives birth to a new individual at rate 1. When a new individual is born, it has the same type as its parent with probability $1-r$ and is a new type, different from all previously observed types, with probability $r$. We refer to individuals with the same type as families and provide an approximation to the joint distribution of family sizes when the population size reaches $N$. We also show that if $1\ll S\ll N^{1-r}$, then the number of families of size at least $S$ is approximately $CNS^{-1/(1-r)}$, while if $N^{1-r}\ll S$ the distribution decays more rapidly than any power.
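The model described above can be explored numerically. The following is a minimal sketch, not the authors' method: it simulates only the embedded jump chain of the Yule process. This relies on the fact that when every individual gives birth at rate 1, the parent of the next birth is uniformly distributed over the current population. The function names `simulate_family_sizes` and `count_families_at_least` are hypothetical and chosen for illustration only.

```python
import random
from collections import Counter


def simulate_family_sizes(n, r, seed=None):
    """Run the jump chain of the duplication model until the population is n.

    Starts from one individual of type 0. At each birth a uniformly chosen
    parent produces a child that is a brand-new type with probability r and
    otherwise inherits the parent's type. Returns the family sizes sorted in
    decreasing order.
    """
    if n < 1:
        raise ValueError("n must be at least 1")
    if not 0.0 <= r <= 1.0:
        raise ValueError("r must lie in [0, 1]")

    rng = random.Random(seed)
    types = [0]      # types[i] is the type of individual i
    next_type = 1    # label reserved for the next new type

    while len(types) < n:
        parent_type = types[rng.randrange(len(types))]
        if rng.random() < r:
            types.append(next_type)   # new type, never seen before
            next_type += 1
        else:
            types.append(parent_type)  # same type as the parent

    return sorted(Counter(types).values(), reverse=True)


def count_families_at_least(sizes, s):
    """Count the families whose size is at least s."""
    return sum(1 for size in sizes if size >= s)


if __name__ == "__main__":
    # Illustrative parameters only; they are not taken from the paper.
    sizes = simulate_family_sizes(n=10_000, r=0.2, seed=1)
    for s in (1, 4, 16, 64):
        print(s, count_families_at_least(sizes, s))
```

Plotting the counts against $S$ on log-log axes for $1\ll S\ll N^{1-r}$ gives a rough visual check of the $S^{-1/(1-r)}$ decay stated in the abstract. A single run at moderate $N$ is noisy, so averaging over several seeds is advisable.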
- Publication: arXiv Mathematics e-prints
- Pub Date: June 2004
- arXiv: arXiv:math/0406216
- Bibcode: 2004math......6216D
- Keywords: Mathematics - Probability; Quantitative Biology - Populations and Evolution; 60J80 (Primary) 60J85; 92D15; 92D20 (Secondary)
- E-Print: Published at http://dx.doi.org/10.1214/009117905000000369 in the Annals of Probability (http://www.imstat.org/aop/) by the Institute of Mathematical Statistics (http://www.imstat.org)