Co-expression of statistically over-represented peptides in proteomes: a key to phylogeny?
Abstract
It is proposed that the co-expression of statistically significant motifs among the sequences of a proteome is a phylogenetic trait. From the co-expression matrix of such motifs in a group of prokaryotic proteomes, a suitable definition of a phylogenetic distance is introduced and the corresponding distance matrix between proteomes is constructed. From the distance matrix, a phylogenetic tree is inferred following a standard procedure. The resulting tree compares well with a reference tree deduced from a distance matrix obtained from the alignment of ribosomal RNA sequences. Our results are consistent with the hypothesis that biological evolution manifests itself as a modulation of basic correlations between short shared peptides present in protein sequences. Moreover, the simple procedure we propose reconfirms that, by sampling entire proteomes, it is possible to average out the effects of lateral gene transfer and infer reasonable phylogenies.
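The pipeline described in the abstract (motif co-expression matrix per proteome, then a distance matrix between proteomes) can be illustrated with a minimal sketch. The abstract does not give the actual distance definition or the motif selection criterion, so everything below is an assumption made for illustration: motifs are matched as plain substrings, co-expression is taken as the fraction of sequences containing both motifs, and the distance is the Euclidean distance between the two co-expression matrices. The names `cooccurrence_matrix`, `matrix_distance`, `distance_matrix`, and the toy data are hypothetical.

```python
from itertools import combinations
from math import sqrt


def cooccurrence_matrix(proteome, motifs):
    """Fraction of sequences in which each pair of motifs occurs together.

    Assumption: a motif is "expressed" in a sequence if it occurs as a
    substring. The proteome must contain at least one sequence.
    """
    n = len(motifs)
    counts = [[0] * n for _ in range(n)]
    for seq in proteome:
        present = [i for i, motif in enumerate(motifs) if motif in seq]
        for i in present:
            for j in present:
                counts[i][j] += 1
    total = len(proteome)
    return [[c / total for c in row] for row in counts]


def matrix_distance(a, b):
    """Euclidean distance between two equally sized matrices.

    Assumption: this stands in for the paper's phylogenetic distance,
    whose definition is not stated in the abstract.
    """
    return sqrt(sum((x - y) ** 2
                    for row_a, row_b in zip(a, b)
                    for x, y in zip(row_a, row_b)))


def distance_matrix(proteomes, motifs):
    """Symmetric distance table between all named proteomes."""
    names = sorted(proteomes)
    mats = {name: cooccurrence_matrix(proteomes[name], motifs) for name in names}
    dist = {name: {name: 0.0} for name in names}
    for p, q in combinations(names, 2):
        dist[p][q] = dist[q][p] = matrix_distance(mats[p], mats[q])
    return dist


# Toy data (hypothetical): three tiny "proteomes" and three short motifs.
MOTIFS = ["AG", "KL", "GG"]
PROTEOMES = {
    "org_a": ["MAGKLT", "MKLGGA", "MAGGGT"],
    "org_b": ["MAGKLS", "MKLGGA", "MAGGGS"],
    "org_c": ["MTTTTT", "MKLKLK", "MGGTTT"],
}
D = distance_matrix(PROTEOMES, MOTIFS)
```

In this toy setup `org_a` and `org_b` share the same motif co-occurrence pattern, so their distance is zero, while `org_c` lies further away. A table like `D` could then be passed to any standard distance-based tree-building method; that step is omitted here because the abstract does not say which procedure was used.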
- Publication: arXiv e-prints
- Pub Date: October 2004
- DOI: 10.48550/arXiv.q-bio/0410011
- arXiv: arXiv:q-bio/0410011
- Bibcode: 2004q.bio....10011F
- Keywords: Quantitative Biology - Molecular Networks; Quantitative Biology - Genomics; Quantitative Biology - Populations and Evolution
- E-Print: Minor typos corrected, partly rewritten conclusions and bibliography added