Optimizing Molecules using Efficient Queries from Property Evaluations
Abstract
Machine-learning-based methods have shown potential for optimizing existing molecules toward more desirable properties, a critical step in accelerating new chemical discovery. Here we propose QMO, a generic query-based molecule optimization framework that exploits latent embeddings from a molecule autoencoder. QMO improves the desired properties of an input molecule through efficient queries, guided by a set of molecular property predictions and evaluation metrics. We show that QMO outperforms existing methods on the benchmark tasks of optimizing small organic molecules for drug-likeness and solubility under similarity constraints. We also demonstrate significant property improvement using QMO on two new and challenging tasks that are important in real-world discovery problems: (i) optimizing existing potential SARS-CoV-2 Main Protease inhibitors toward higher binding affinity; and (ii) improving known antimicrobial peptides toward lower toxicity. Results from QMO show high consistency with external validations, suggesting an effective means of addressing material optimization problems with design constraints.
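The query-based idea described in the abstract can be illustrated with a minimal sketch. It is not the authors' implementation: it assumes a random-direction finite-difference gradient estimate, which the abstract does not specify, and it replaces the molecule autoencoder and property predictors with a toy black-box loss over a 2-D latent vector. The names `estimate_gradient`, `optimize_latent`, and `toy_loss`, and all parameter values, are hypothetical.

```python
import random


def estimate_gradient(loss, z, num_queries=20, beta=0.1, rng=None):
    """Estimate the gradient of a black-box loss at z using only loss queries."""
    rng = rng or random.Random(0)
    dim = len(z)
    base = loss(z)
    grad = [0.0] * dim
    for _ in range(num_queries):
        # Sample a random unit direction in latent space.
        u = [rng.gauss(0.0, 1.0) for _ in range(dim)]
        norm = sum(x * x for x in u) ** 0.5
        u = [x / norm for x in u]
        # One extra query at a slightly perturbed latent point.
        z_pert = [zi + beta * ui for zi, ui in zip(z, u)]
        diff = (loss(z_pert) - base) / beta
        for i in range(dim):
            grad[i] += dim * diff * u[i] / num_queries
    return grad


def optimize_latent(loss, z0, steps=200, lr=0.05, **grad_kwargs):
    """Gradient descent on estimated gradients; returns the best point seen."""
    rng = random.Random(0)
    z = list(z0)
    best_z, best_loss = list(z), loss(z)
    for _ in range(steps):
        g = estimate_gradient(loss, z, rng=rng, **grad_kwargs)
        z = [zi - lr * gi for zi, gi in zip(z, g)]
        current = loss(z)
        if current < best_loss:
            best_z, best_loss = list(z), current
    return best_z, best_loss


def toy_loss(z, z0=(0.0, 0.0), target=(2.0, 1.0), max_dist=1.0, weight=10.0):
    """Toy stand-in for 'decode the latent vector, then score the molecule'.

    The first term plays the role of a property gap to be reduced; the hinge
    term penalizes moving too far from the starting point, standing in for a
    similarity constraint. Both are assumptions made for illustration.
    """
    property_gap = sum((a - b) ** 2 for a, b in zip(z, target))
    dist = sum((a - b) ** 2 for a, b in zip(z, z0)) ** 0.5
    return property_gap + weight * max(0.0, dist - max_dist)


best_z, best_loss = optimize_latent(toy_loss, [0.0, 0.0])
print("initial loss:", toy_loss([0.0, 0.0]))
print("best loss found:", best_loss)
```

In a real setting the loss would wrap a decoder and one or more property evaluators, so each call to `loss` counts as one query; `num_queries` then trades gradient accuracy against evaluation cost.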
- Publication: arXiv e-prints
- Pub Date: November 2020
- DOI: 10.48550/arXiv.2011.01921
- arXiv: arXiv:2011.01921
- Bibcode: 2020arXiv201101921H
- Keywords: Computer Science - Machine Learning; Quantitative Biology - Biomolecules
- E-Print: Preprint version to be published at Nature Machine Intelligence