Inference of Natural Selection from Interspersed Genomic Elements Based on Polymorphism and Divergence
Abstract
Complete genome sequences contain valuable information about natural selection, but extracting this information for short, widely scattered noncoding elements remains a challenging problem. Here we introduce a new computational method for addressing this problem called Inference of Natural Selection from Interspersed Genomically coHerent elemenTs (INSIGHT). INSIGHT uses a generative probabilistic model to contrast patterns of polymorphism and divergence in the elements of interest with those in flanking neutral sites, pooling weak information from many short elements in a manner that accounts for variation among loci in mutation rates and genealogical backgrounds. The method is able to disentangle the contributions of weak negative, strong negative, and positive selection based on their distinct effects on patterns of polymorphism and divergence. Information about divergence is obtained from multiple outgroup genomes using a full phylogenetic model. The model is efficiently fitted to genome-wide data by decomposing the maximum likelihood estimation procedure into three straightforward stages. The key selection-related parameters are estimated by expectation maximization. Using simulations, we show that INSIGHT can accurately estimate several parameters of interest even in complex demographic scenarios. We apply our methods to noncoding RNAs, promoter regions, and transcription factor binding sites in the human genome, and find clear evidence of natural selection. We also present a detailed analysis of particular nucleotide positions within GATA2 binding sites and primary micro-RNA transcripts.
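The expectation-maximization idea mentioned in the abstract can be illustrated with a deliberately simplified toy, which is not the INSIGHT model itself. The sketch below assumes that each site in the pooled elements is either under selection (with unknown probability `rho`) or neutral, and that the chance of a site being polymorphic is known for each class (`q_selected`, `q_neutral`, the latter imagined as estimated from flanking neutral sites). All names, and the two-class Bernoulli setup, are hypothetical; the published method additionally uses divergence, a phylogenetic model, and per-locus variation, none of which appear here.

```python
def em_fraction_selected(n_sites, n_poly, q_neutral, q_selected,
                         max_iters=5000, tol=1e-12):
    """Toy EM estimate of rho, the fraction of sites under selection.

    Assumed (hypothetical) model: a site is 'selected' with probability rho
    and polymorphic with probability q_selected, or 'neutral' with
    probability 1 - rho and polymorphic with probability q_neutral.
    Only the pooled counts of polymorphic and monomorphic sites are used.
    """
    n_mono = n_sites - n_poly
    rho = 0.5  # arbitrary starting value strictly inside (0, 1)
    for _ in range(max_iters):
        # E-step: posterior probability that a site is selected,
        # given that it is polymorphic or monomorphic.
        w_poly = (rho * q_selected) / (
            rho * q_selected + (1.0 - rho) * q_neutral)
        w_mono = (rho * (1.0 - q_selected)) / (
            rho * (1.0 - q_selected) + (1.0 - rho) * (1.0 - q_neutral))
        # M-step: the new rho is the average posterior over all sites.
        new_rho = (n_poly * w_poly + n_mono * w_mono) / n_sites
        converged = abs(new_rho - rho) < tol
        rho = new_rho
        if converged:
            break
    return rho


# Example: 40 of 1000 pooled sites are polymorphic. For this toy model the
# closed-form maximum likelihood value is (0.10 - 0.04) / (0.10 - 0.02) = 0.75.
rho_hat = em_fraction_selected(1000, 40, q_neutral=0.10, q_selected=0.02)
print(round(rho_hat, 4))
```

The point of the toy is only that many individually uninformative sites, pooled and contrasted with a neutral expectation, can pin down a shared selection parameter. In this reduced setting the estimate is also available in closed form, so EM is used purely for illustration.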
- Publication: arXiv e-prints
- Pub Date: September 2011
- DOI: 10.48550/arXiv.1109.6381
- arXiv: arXiv:1109.6381
- Bibcode: 2011arXiv1109.6381G
- Keywords: Quantitative Biology - Genomics
- E-Print: 21-page manuscript, 4 figures, 4 tables + 3 supp figures + 3 supp tables + supp methods. V4: additional results on human noncoding RNAs annotated by GENCODE + refinement of previous versions + additional supplementary material included in the main document. V5: some minor modifications. V6: this is an electronic version of an article published in Mol Biol Evol, 2013