Understanding long-range correlations in DNA sequences
Abstract
In this paper, we review the literature on statistical long-range correlation in DNA sequences. We examine the current evidence for these correlations, and conclude that a mixture of many length scales (including some relatively long ones) in DNA sequences is responsible for the observed 1/f-like spectral component. We note the complexity of the correlation structure in DNA sequences. The observed complexity often makes it hard, or impossible, to decompose the sequence into a few statistically stationary regions. We suggest that, based on the complexity of DNA sequences, a fruitful approach to understand long-range correlation is to model duplication, and other rearrangement processes, in DNA sequences. One model, called “expansion-modification system”, contains only point duplication and point mutation. Though simplistic, this model is able to generate sequences with 1/f spectra. We emphasize the importance of DNA duplication in its contribution to the observed long-range correlation in DNA sequences.
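The abstract describes the expansion-modification system only as containing point duplication and point mutation. As a minimal sketch of what such a rule could look like, the Python below assumes a binary alphabet and a synchronous update in which each symbol either flips (mutation) or is doubled (duplication). The function name `expansion_modification`, the parameter `p_mut`, the seed string, and the length cap `max_len` are illustrative assumptions, not details taken from the paper.

```python
import random


def expansion_modification(seed="0", steps=12, p_mut=0.1, rng=None, max_len=100_000):
    """Grow a binary string by repeated point duplication and point mutation.

    Assumed rule (illustrative, not quoted from the paper): at every step,
    each symbol independently either
      - mutates with probability p_mut (0 -> 1, 1 -> 0), or
      - duplicates with probability 1 - p_mut (0 -> 00, 1 -> 11).
    """
    rng = rng or random.Random(0)  # fixed default seed so runs are repeatable
    flip = {"0": "1", "1": "0"}
    seq = seed
    for _ in range(steps):
        out = []
        for symbol in seq:
            if rng.random() < p_mut:
                out.append(flip[symbol])     # point mutation
            else:
                out.append(symbol + symbol)  # point duplication
        seq = "".join(out)
        if len(seq) >= max_len:              # stop before the string grows too large
            break
    return seq


if __name__ == "__main__":
    s = expansion_modification(steps=10, p_mut=0.1)
    print(len(s), s[:60])
```

With `p_mut=0` the rule is pure duplication and yields uniform blocks; with a small nonzero `p_mut`, mutated symbols are themselves duplicated in later steps, so the sequence contains domains of many different sizes. To test for a 1/f-like spectrum, one could map the symbols to numbers and estimate the power spectrum of the result.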
- Publication: Physica D Nonlinear Phenomena
- Pub Date: August 1994
- DOI: 10.1016/0167-2789(94)90294-1
- arXiv: arXiv:chao-dyn/9403002
- Bibcode: 1994PhyD...75..392L
- Keywords: Nonlinear Sciences - Chaotic Dynamics; Quantitative Biology - Genomics
- E-Print: A LaTeX file, a macro file (ccsr.sty) to run LaTeX, and a figures.uu file which contains 17 PostScript figures. The text should contain 29 pages. To be published in Physica D (1994).