Sublinear Growth of Information in DNA Sequences
Abstract
We introduce a novel method for analysing complete genomes and recognising some of their distinctive features by means of an adaptive compression algorithm that is not DNA-oriented. We study the Information Content as a function of the number of symbols encoded by the algorithm. Preliminary results are shown for regions with a sublinear type of information growth, which is strictly connected to the presence of highly repetitive subregions that might be supposed to have a regulatory function within the genome.
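The idea of tracking Information Content against the number of encoded symbols can be illustrated with a minimal sketch. It assumes zlib (DEFLATE) as a stand-in general-purpose compressor, since the abstract does not name the algorithm used, and it estimates a growth exponent from a log-log fit, which is an illustrative choice rather than the authors' procedure. The function names `prefix_lengths`, `information_curve` and `growth_exponent` are hypothetical.

```python
import math
import random
import zlib


def prefix_lengths(total, n_min=1024, n_points=12):
    """Geometrically spaced prefix lengths between n_min and total."""
    ratio = (total / n_min) ** (1 / (n_points - 1))
    return sorted({min(total, round(n_min * ratio ** k)) for k in range(n_points)})


def information_curve(seq, n_min=1024, n_points=12):
    """Return (n, bits) pairs: compressed size in bits of the first n symbols.

    Assumption: zlib is only a proxy for the adaptive compressor in the paper.
    """
    data = seq.encode("ascii")
    return [
        (n, 8 * len(zlib.compress(data[:n], 9)))
        for n in prefix_lengths(len(data), n_min, n_points)
    ]


def growth_exponent(curve):
    """Least-squares slope of log(bits) against log(n).

    A slope near 1 suggests linear growth; a clearly smaller slope
    suggests sublinear growth over the sampled range.
    """
    xs = [math.log(n) for n, _ in curve]
    ys = [math.log(b) for _, b in curve]
    mx = sum(xs) / len(xs)
    my = sum(ys) / len(ys)
    sxx = sum((x - mx) ** 2 for x in xs)
    sxy = sum((x - mx) * (y - my) for x, y in zip(xs, ys))
    return sxy / sxx


if __name__ == "__main__":
    random.seed(0)
    random_seq = "".join(random.choice("ACGT") for _ in range(100_000))
    repeat_seq = "ACGTTGCA" * 12_500  # toy highly repetitive region
    print("random:    ", growth_exponent(information_curve(random_seq)))
    print("repetitive:", growth_exponent(information_curve(repeat_seq)))
```

On a synthetic random sequence the fitted exponent should sit close to 1, while a highly repetitive sequence should give a visibly smaller value. The proxy is crude: zlib has a 32 KB match window and a fixed per-stream overhead, so the fitted exponent depends on the sampled range and should not be read as a measurement of the quantity studied in the paper.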
- Publication: arXiv e-prints
- Pub Date: February 2004
- arXiv: arXiv:q-bio/0402046
- Bibcode: 2004q.bio.....2046M
- Keywords: Quantitative Biology - Genomics; Condensed Matter - Statistical Mechanics; Physics - Data Analysis; Statistics and Probability
- E-Print: 30 pages, 13 figures, submitted (Oct. 2003)