Boosting Docking-based Virtual Screening with Deep Learning
Abstract
In this work, we propose a deep learning approach to improve docking-based virtual screening. The introduced deep neural network, DeepVS, uses the output of a docking program and learns how to extract relevant features from basic data such as atom and residue types obtained from protein-ligand complexes. Our approach introduces the use of atom and amino acid embeddings and implements an effective way of creating distributed vector representations of protein-ligand complexes by modeling the compound as a set of atom contexts that is further processed by a convolutional layer. One of the main advantages of the proposed method is that it does not require feature engineering. We evaluate DeepVS on the Directory of Useful Decoys (DUD), using the output of two docking programs: AutoDock Vina 1.1.2 and DOCK 6.6. Using a strict evaluation with leave-one-out cross-validation, DeepVS outperforms the docking programs in both AUC ROC and enrichment factor. Moreover, using the output of AutoDock Vina 1.1.2, DeepVS achieves an AUC ROC of 0.81, which, to the best of our knowledge, is the best AUC reported so far for virtual screening using the 40 receptors from DUD.
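The two metrics named in the abstract, AUC ROC and enrichment factor, can be illustrated with a minimal sketch. This is not the authors' evaluation code. It assumes that a higher score means "more likely active" (raw docking energies, where lower is better, would need to be negated first), and it uses the common definition of enrichment factor at a top-ranked fraction, which may differ from the exact variant used in the paper. The function names `auc_roc` and `enrichment_factor` and the toy data are hypothetical.

```python
def auc_roc(scores, labels):
    """AUC ROC as the probability that a random active outranks a random decoy.

    Assumption: higher score = more likely active; ties count as half a win.
    """
    actives = [s for s, y in zip(scores, labels) if y == 1]
    decoys = [s for s, y in zip(scores, labels) if y == 0]
    if not actives or not decoys:
        raise ValueError("need at least one active and one decoy")
    wins = 0.0
    for a in actives:
        for d in decoys:
            if a > d:
                wins += 1.0
            elif a == d:
                wins += 0.5
    return wins / (len(actives) * len(decoys))


def enrichment_factor(scores, labels, fraction):
    """Active rate in the top-ranked fraction divided by the overall active rate."""
    if not 0 < fraction <= 1:
        raise ValueError("fraction must be in (0, 1]")
    total = len(scores)
    n_actives = sum(labels)
    if n_actives == 0:
        raise ValueError("need at least one active")
    # Size of the top-ranked subset; always keep at least one compound.
    n_top = max(1, round(total * fraction))
    ranked = sorted(zip(scores, labels), key=lambda pair: pair[0], reverse=True)
    top_actives = sum(y for _, y in ranked[:n_top])
    return (top_actives / n_top) / (n_actives / total)


# Toy ranking (made-up values): 3 actives (label 1) among 10 compounds.
toy_scores = [0.9, 0.8, 0.7, 0.6, 0.5, 0.4, 0.3, 0.2, 0.1, 0.05]
toy_labels = [1, 1, 0, 1, 0, 0, 0, 0, 0, 0]
print(auc_roc(toy_scores, toy_labels))
print(enrichment_factor(toy_scores, toy_labels, 0.2))
```

On this toy ranking, 20 of the 21 active-decoy pairs are ordered correctly, so the AUC is 20/21. Both compounds in the top 20% are actives, against an overall active rate of 3/10, so the enrichment factor at 20% is 10/3.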
- Publication: arXiv e-prints
- Pub Date: August 2016
- DOI: 10.48550/arXiv.1608.04844
- arXiv: arXiv:1608.04844
- Bibcode: 2016arXiv160804844C
- Keywords: Quantitative Biology - Quantitative Methods
- E-Print: The final version of this manuscript will be published in the Journal of Chemical Information and Modeling