Predicting protein secondary structure with Neural Machine Translation
Abstract
We present an analysis of a novel tool for protein secondary structure prediction based on the recently investigated Neural Machine Translation framework. The tool predicts secondary structure from primary structure quickly and accurately, with sub-second prediction time even for batched inputs. We hypothesize that Neural Machine Translation can improve on current predictive accuracy by better encoding complex relationships between nearby but non-adjacent amino acids. We give an overview of the modifications we made to the framework to improve accuracy on protein sequences. We report 65.9% Q3 accuracy and analyze the strengths and weaknesses of our predictive model.
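For readers unfamiliar with the metric, Q3 accuracy is the fraction of residues whose predicted three-state label (helix H, strand E, coil C) matches the observed label. The sketch below is illustrative only: the function name `q3_accuracy` and the example label strings are assumptions made for this illustration, not code or data from the paper.

```python
def q3_accuracy(predicted: str, observed: str) -> float:
    """Return the fraction of positions where the three-state labels agree.

    Both arguments are per-residue label strings over {H, E, C}.
    This helper is hypothetical and is not taken from the paper.
    """
    if len(predicted) != len(observed):
        raise ValueError("label strings must have equal length")
    if not observed:
        raise ValueError("label strings must be non-empty")
    matches = sum(p == o for p, o in zip(predicted, observed))
    return matches / len(observed)


# Made-up labels for a 10-residue sequence (not data from the paper).
observed = "CCHHHHEEEC"
predicted = "CCHHHCEEEC"  # one helix residue mispredicted as coil
print(f"Q3 = {q3_accuracy(predicted, observed):.1%}")  # prints "Q3 = 90.0%"
```

In practice Q3 is usually computed over all residues in a test set, which may differ slightly from the mean of per-sequence scores.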
- Publication: arXiv e-prints
- Pub Date: September 2018
- DOI: 10.48550/arXiv.1809.09210
- arXiv: arXiv:1809.09210
- Bibcode: 2018arXiv180909210W
- Keywords: Quantitative Biology - Quantitative Methods; Quantitative Biology - Biomolecules
- E-Print: 9 pages, 9 figures, 2 tables