BETAWRAP: Successful prediction of parallel β-helices from primary sequence reveals an association with many microbial pathogens
Abstract
The amino acid sequence rules that specify β-sheet structure in proteins remain obscure. A subclass of β-sheet proteins, parallel β-helices, represents a processive folding of the chain into an elongated, topologically simpler fold than globular β-sheets. In this paper, we present a computational approach that predicts the right-handed parallel β-helix supersecondary structural motif in primary amino acid sequences by using β-strand interactions learned from non-β-helix structures. A program called BETAWRAP (http://theory.lcs.mit.edu/betawrap) implements this method and recognizes each of the seven known parallel β-helix families when trained on the known parallel β-helices from outside that family. BETAWRAP identifies 2,448 sequences among 595,890 screened from the National Center for Biotechnology Information (NCBI; http://www.ncbi.nlm.nih.gov/) nonredundant protein database as likely parallel β-helices. It identifies surprisingly many bacterial and fungal protein sequences that play a role in human infectious disease; these include toxins, virulence factors, adhesins, and surface proteins of Chlamydia, Helicobacteria, Bordetella, Leishmania, Borrelia, Rickettsia, Neisseria, and Bacillus anthracis. Also unexpected was the rarity of the parallel β-helix fold and its predicted sequences among higher eukaryotes. The computational method introduced here can be called a three-dimensional dynamic profile method because it generates interstrand pairwise correlations from a processive sequence wrap. Such methods may be applicable to recognizing other β structures for which strand topology and profiles of residue accessibility are well conserved.
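To make the idea of scoring interstrand pairs along a processive wrap more concrete, the following is a minimal toy sketch in Python. It is not the BETAWRAP scoring function: the hydrophobic residue set, the `pair_score` values, the strand length, the number of rungs, and the allowed gap range are all hypothetical choices made for illustration, and the names `pair_score`, `stack_score`, and `best_wrap` are invented here. The sketch only shows the general shape of such a method: a dynamic program picks successive strand start positions so that residues stacked on top of each other in adjacent rungs score well.

```python
# Toy illustration only. All scores and parameters below are assumptions,
# not values from the paper or from the BETAWRAP program.
HYDROPHOBIC = set("AVILMFWC")  # hypothetical residue grouping
NEG = float("-inf")


def pair_score(a, b, inward):
    """Hypothetical stand-in for a learned pairwise stacking preference."""
    both = a in HYDROPHOBIC and b in HYDROPHOBIC
    neither = a not in HYDROPHOBIC and b not in HYDROPHOBIC
    if inward:
        return 1.0 if both else -0.5
    return 0.5 if neither else -0.25


def stack_score(seq, i, j, strand_len=5):
    """Score the strand starting at i stacked against the strand starting at j.

    Positions alternate between inward-facing (even offsets) and
    outward-facing (odd offsets), which is a simplifying assumption.
    """
    return sum(
        pair_score(seq[i + k], seq[j + k], inward=(k % 2 == 0))
        for k in range(strand_len)
    )


def best_wrap(seq, n_rungs=5, strand_len=5, min_gap=18, max_gap=25):
    """Find strand starts for n_rungs consecutive rungs with the best total score.

    best[p] holds the best score of a partial wrap whose latest strand
    starts at p. Returns (score, starts), or (-inf, []) if no wrap fits.
    """
    last = len(seq) - strand_len
    if last < 0:
        return NEG, []
    best = [0.0] * (last + 1)  # the first strand may start anywhere
    back = []  # one back-pointer table per added rung
    for _ in range(1, n_rungs):
        new = [NEG] * (last + 1)
        ptr = [None] * (last + 1)
        for p in range(last + 1):
            for gap in range(min_gap, max_gap + 1):
                q = p - gap  # start of the previous strand
                if q < 0 or best[q] == NEG:
                    continue
                s = best[q] + stack_score(seq, q, p, strand_len)
                if s > new[p]:
                    new[p], ptr[p] = s, q
        best = new
        back.append(ptr)
    end = max(range(last + 1), key=lambda p: best[p])
    if best[end] == NEG:
        return NEG, []
    starts = [end]
    for ptr in reversed(back):  # trace the wrap back to its first strand
        starts.append(ptr[starts[-1]])
    return best[end], starts[::-1]


if __name__ == "__main__":
    # Synthetic sequence with a 20-residue repeat, built only to exercise the code.
    demo = ("VSVTV" + "G" * 15) * 5
    print(best_wrap(demo))
```

On the synthetic repeat in the usage example, the dynamic program recovers strand starts spaced one repeat apart. A real predictor would replace the toy `pair_score` with statistics learned from known structures and would need a calibrated threshold to separate likely β-helices from other sequences.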
- Publication:
- Proceedings of the National Academy of Sciences
- Pub Date:
- December 2001
- DOI:
- Bibcode:
- 2001PNAS...9814819B
- Keywords:
- Biochemistry