Structure-based Drug Design with Equivariant Diffusion Models
Abstract
Structure-based drug design (SBDD) aims to design small-molecule ligands that bind with high affinity and specificity to pre-determined protein targets. Generative SBDD methods leverage structural data of drugs in complex with their protein targets to propose new drug candidates. These approaches typically place one atom at a time in an autoregressive fashion using the binding pocket as well as previously added ligand atoms as context in each step. Recently a surge of diffusion generative models has entered this domain which hold promise to capture the statistical properties of natural ligands more faithfully. However, most existing methods focus exclusively on bottom-up de novo design of compounds or tackle other drug development challenges with task-specific models. The latter requires curation of suitable datasets, careful engineering of the models and retraining from scratch for each task. Here we show how a single pre-trained diffusion model can be applied to a broader range of problems, such as off-the-shelf property optimization, explicit negative design, and partial molecular design with inpainting. We formulate SBDD as a 3D-conditional generation problem and present DiffSBDD, an SE(3)-equivariant diffusion model that generates novel ligands conditioned on protein pockets. Our in silico experiments demonstrate that DiffSBDD captures the statistics of the ground truth data effectively. Furthermore, we show how additional constraints can be used to improve the generated drug candidates according to a variety of computational metrics. These results support the assumption that diffusion models represent the complex distribution of structural data more accurately than previous methods, and are able to incorporate additional design objectives and constraints changing nothing but the sampling strategy.
- Publication:
-
arXiv e-prints
- Pub Date:
- October 2022
- DOI:
- 10.48550/arXiv.2210.13695
- arXiv:
- arXiv:2210.13695
- Bibcode:
- 2022arXiv221013695S
- Keywords:
-
- Quantitative Biology - Biomolecules;
- Computer Science - Machine Learning