Asymptotically increasing compliance of genomes with Chargaff's second parity rules through inversions and inverted transpositions
Chargaff's second parity rules for mononucleotides and oligonucleotides (CIImono and CIIoligo rules) state that a sufficiently long (>100 kb) strand of genomic DNA that contains N copies of a mono- or oligonucleotide also contains N copies of its reverse complementary mono- or oligonucleotide on the same strand. There is very strong support in the literature for the validity of the rules in coding and noncoding regions, especially for the CIImono rule. Because the experimental support for the CIIoligo rule is much less complete, the present article, focusing on the special case of trinucleotides (triplets), examined several gigabases of genome sequences from a wide range of species and kingdoms, as well as organelles such as mitochondria and chloroplasts. I found that all genomes, with the sole exception of certain mitochondria, complied with the CIItriplet rule at a very high level of accuracy in coding and noncoding regions alike. Based on the growing evidence that genomes may contain up to millions of copies of interspersed repetitive elements, I propose in this article a quantitative formulation of the hypothesis that inversions and inverted transpositions could be a major contributing, if not dominant, factor in the almost universal validity of the rules.
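To make the CIItriplet rule concrete, the following minimal Python sketch counts every overlapping triplet on a single strand and compares each count with that of its reverse complement. The function names and the relative-deviation measure |N - M| / (N + M) are illustrative assumptions, not the metric or software used in this article; a deviation of 0 means the pair complies exactly with the rule.

```python
from collections import Counter

# Translation table for Watson-Crick complements (assumes plain A/C/G/T input).
_COMPLEMENT = str.maketrans("ACGT", "TGCA")


def reverse_complement(oligo: str) -> str:
    """Return the reverse complement of an oligonucleotide, e.g. AAC -> GTT."""
    return oligo.translate(_COMPLEMENT)[::-1]


def triplet_counts(strand: str) -> Counter:
    """Count all overlapping triplets on one strand, skipping ambiguous bases."""
    strand = strand.upper()
    counts = Counter()
    for i in range(len(strand) - 2):
        triplet = strand[i:i + 3]
        if set(triplet) <= set("ACGT"):  # ignore windows containing N, etc.
            counts[triplet] += 1
    return counts


def parity_deviations(strand: str) -> dict:
    """Map each (triplet, reverse complement) pair to |N - M| / (N + M).

    This relative deviation is a hypothetical measure chosen for
    illustration: 0.0 means perfect compliance, 1.0 means that only
    one member of the pair occurs on the strand.
    """
    counts = triplet_counts(strand)
    observed = counts.keys() | {reverse_complement(t) for t in counts}
    deviations = {}
    for triplet in sorted(observed):
        partner = reverse_complement(triplet)
        if triplet > partner:
            continue  # report each pair once, keyed by its smaller member
        n, m = counts[triplet], counts[partner]
        deviations[(triplet, partner)] = abs(n - m) / (n + m)
    return deviations
```

Calling `parity_deviations` on a strand read from a sequence file would list, for each of the 32 possible triplet pairs that occurs, how far the strand departs from parity. Short test strings only demonstrate the mechanics; the rule itself is stated for strands longer than about 100 kb.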