Protein Secondary Structure Prediction with Long Short Term Memory Networks
Abstract
Prediction of protein secondary structure from the amino acid sequence is a classical bioinformatics problem. Common methods use feed forward neural networks or SVMs combined with a sliding window, as these models do not naturally handle sequential data. Recurrent neural networks are a generalization of the feed forward neural network that naturally handles sequential data. We use a bidirectional recurrent neural network with long short term memory cells for prediction of secondary structure and evaluate using the CB513 dataset. On the secondary structure 8-class problem we report better performance (0.674) than the state of the art (0.664). Our model includes feed forward networks between the long short term memory cells, a path that can be further explored.
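To make the sliding-window baseline mentioned in the abstract concrete, the following is a minimal sketch of how a fixed-size window could be turned into per-residue input features for a feed forward network or SVM. It is an illustration only, not the authors' pipeline: the names `AMINO_ACIDS`, `one_hot`, `sliding_windows` and the parameter `half_width` are hypothetical, and the one-hot encoding with zero padding at the sequence ends is an assumption.

```python
from typing import List

# Assumption: only the 20 standard residues are encoded.
AMINO_ACIDS = "ACDEFGHIKLMNPQRSTVWY"


def one_hot(residue: str) -> List[int]:
    """Encode one residue as a 20-dimensional indicator vector."""
    return [1 if residue == aa else 0 for aa in AMINO_ACIDS]


def sliding_windows(sequence: str, half_width: int) -> List[List[int]]:
    """Return one flattened feature vector per residue.

    Each vector concatenates the one-hot encodings of the residue and its
    `half_width` neighbours on each side. Positions beyond the sequence
    ends are filled with all-zero vectors.
    """
    padding = [[0] * len(AMINO_ACIDS) for _ in range(half_width)]
    encoded = padding + [one_hot(r) for r in sequence] + padding
    window = 2 * half_width + 1
    features = []
    for i in range(len(sequence)):
        # Flatten the window centred on residue i into a single vector.
        flat = [x for vec in encoded[i:i + window] for x in vec]
        features.append(flat)
    return features


if __name__ == "__main__":
    feats = sliding_windows("ACD", half_width=1)
    print(len(feats), len(feats[0]))
```

In this sketch each residue is classified from a fixed neighbourhood of `2 * half_width + 1` positions, so context outside the window is invisible to the model. A bidirectional recurrent network, by contrast, can in principle draw on the whole sequence on both sides of each residue.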
- Publication: arXiv e-prints
- Pub Date: December 2014
- DOI: 10.48550/arXiv.1412.7828
- arXiv: arXiv:1412.7828
- Bibcode: 2014arXiv1412.7828K
- Keywords: Quantitative Biology - Quantitative Methods; Computer Science - Machine Learning; Computer Science - Neural and Evolutionary Computing
- E-Print: v2: adds larger network with slightly better results, update author affiliations