Comparison of amino acid occurrence and composition for predicting protein folds
Abstract
Background: Prediction of protein three-dimensional structures from amino acid sequences is a long-standing goal in computational and molecular biology. Successful discrimination of protein folds would help to improve the accuracy of protein 3D structure prediction.

Results: In this work, we propose a method based on linear discriminant analysis (LDA) for recognizing proteins belonging to 30 different folds, using the occurrence of amino acid residues in a set of 1612 proteins. The method discriminates globular proteins from 30 major folding types with a sensitivity of 37%, which is comparable to or better than other methods in the literature. A web server has been developed for predicting the folding type of a protein from its amino acid sequence; it is available at http://granular.com/PROLDA/.

Conclusions: Linear discriminant analysis based on amino acid occurrence can successfully recognize protein folds. The method has several advantages: (i) it directly predicts the folding type of a protein without performing pair-wise comparisons, (ii) it can discriminate folds among a large number of proteins, and (iii) it produces results very quickly. It is a simple method that can easily be incorporated into other structure prediction algorithms.
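The title contrasts amino acid occurrence with composition, but the abstract does not define either term. As a minimal sketch, assuming that occurrence means the raw count of each of the 20 standard residues in a sequence and composition means that count divided by the number of standard residues, the two feature vectors could be computed as follows. The function names, the residue ordering, and the example sequence are hypothetical and are not taken from the paper; the LDA step itself is not shown.

```python
from collections import Counter

# The 20 standard amino acids in one-letter code (this ordering is an assumption).
AMINO_ACIDS = "ACDEFGHIKLMNPQRSTVWY"


def occurrence(sequence):
    """Return raw counts of the 20 standard residues (assumed definition)."""
    counts = Counter(sequence.upper())
    # Non-standard symbols (e.g. X, B, Z) are simply ignored here.
    return [counts.get(aa, 0) for aa in AMINO_ACIDS]


def composition(sequence):
    """Return counts normalised by the number of standard residues (assumed definition)."""
    occ = occurrence(sequence)
    total = sum(occ)
    if total == 0:
        # Empty or fully non-standard input: return an all-zero vector.
        return [0.0] * len(AMINO_ACIDS)
    return [n / total for n in occ]


if __name__ == "__main__":
    seq = "MKTAYIAKQR"  # hypothetical sequence, for illustration only
    print(occurrence(seq))
    print(composition(seq))
```

Under these assumed definitions, occurrence retains information about sequence length while composition discards it, so two proteins with identical composition can still differ in occurrence. Either 20-dimensional vector could then be passed to a linear discriminant classifier.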
- Publication: arXiv e-prints
- Pub Date: September 2006
- DOI: 10.48550/arXiv.q-bio/0609037
- arXiv: arXiv:q-bio/0609037
- Bibcode: 2006q.bio.....9037T
- Keywords: Quantitative Biology - Biomolecules; Condensed Matter - Soft Condensed Matter; Nonlinear Sciences - Adaptation and Self-Organizing Systems; Quantitative Biology - Quantitative Methods