Clustering of SNPs along a chromosome: can the neutral model be rejected?
Abstract
Single nucleotide polymorphisms (SNPs) often appear in clusters along the length of a chromosome. This is due to variation in local coalescent times caused by, for example, selection or recombination. Here we investigate whether recombination alone (within a neutral model) can cause statistically significant SNP clustering. We measure the extent of SNP clustering as the ratio between the variance of the number of SNPs found in bins of length $l$ and the mean number of SNPs in such bins, $\sigma^2_l/\mu_l$. For a uniform SNP distribution $\sigma^2_l/\mu_l=1$, whereas for clustered SNPs $\sigma^2_l/\mu_l > 1$. Apart from the bin length, three length scales are important when accounting for SNP clustering: the mean distance between neighboring SNPs, $\Delta$; the mean length of chromosome segments with constant time to the most recent common ancestor, $\el$; and the total length of the chromosome, $L$. We show that SNP clustering is observed if $\Delta < \el \ll L$. Moreover, if $l\ll \el \ll L$, clustering becomes independent of the rate of recombination. We apply our results to the analysis of SNP data sets from mice and from human chromosomes 6 and X. Of the three data sets investigated, the human X chromosome displays the most significant deviation from neutrality.
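The clustering measure $\sigma^2_l/\mu_l$ described in the abstract can be illustrated with a minimal sketch. The function names, the toy "hot/cold segment" model, and all numerical values below are assumptions made for illustration; this is not the authors' code, and the toy model is not a coalescent simulation. It only shows how the variance-to-mean ratio of bin counts separates uniformly placed SNPs from clustered ones.

```python
import random
from statistics import mean, pvariance


def bin_counts(positions, chrom_length, bin_length):
    """Count SNPs in consecutive, non-overlapping bins of length bin_length."""
    n_bins = int(chrom_length // bin_length)
    counts = [0] * n_bins
    for x in positions:
        i = int(x // bin_length)
        if i < n_bins:  # ignore positions beyond the last complete bin
            counts[i] += 1
    return counts


def dispersion_ratio(positions, chrom_length, bin_length):
    """Variance of the per-bin SNP count divided by its mean."""
    counts = bin_counts(positions, chrom_length, bin_length)
    mu = mean(counts)
    if mu == 0:
        raise ValueError("no SNPs fall inside the bins")
    return pvariance(counts) / mu


# Hypothetical parameters, chosen only for this illustration.
L = 1_000_000      # chromosome length
l = 1_000          # bin length
segment = 20_000   # length of toy segments with constant SNP density

# Uniformly placed SNPs: the ratio is expected to be close to 1.
rng = random.Random(1)
uniform_snps = [rng.uniform(0, L) for _ in range(10_000)]
uniform_ratio = dispersion_ratio(uniform_snps, L, l)

# Toy clustered SNPs: alternating segments keep all or about 10% of candidates,
# a crude stand-in for variation in local coalescent times along the chromosome.
rng = random.Random(0)
clustered_snps = []
for _ in range(20_000):
    x = rng.uniform(0, L)
    in_dense_segment = int(x // segment) % 2 == 0
    if in_dense_segment or rng.random() < 0.1:
        clustered_snps.append(x)
clustered_ratio = dispersion_ratio(clustered_snps, L, l)

print(f"uniform:   {uniform_ratio:.2f}")
print(f"clustered: {clustered_ratio:.2f}")
```

In this toy setup the bin length is much shorter than the segment length, loosely mirroring the regime $l \ll \el$ mentioned in the abstract; the population variance (`pvariance`) is used, which for many bins differs negligibly from the sample variance.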
- Publication: arXiv e-prints
- Pub Date: July 2002
- arXiv: arXiv:physics/0207024
- Bibcode: 2002physics...7024E
- Keywords: Physics - Biological Physics; Quantitative Biology
- E-Print: 17 pages, 5 figures