Crossing New Frontiers: Knowledge-Augmented Large Language Model Prompting for Zero-Shot Text-Based De Novo Molecule Design
Abstract
Molecule design is a multifaceted approach that leverages computational methods and experiments to optimize molecular properties, fast-tracking new drug discoveries, innovative material development, and more efficient chemical processes. Recently, text-based molecule design has emerged, inspired by next-generation AI tasks analogous to foundational vision-language models. Our study explores the use of knowledge-augmented prompting of large language models (LLMs) for the zero-shot text-conditional de novo molecular generation task. Our approach uses task-specific instructions and a few demonstrations to address distributional shift challenges when constructing augmented prompts for querying LLMs to generate molecules consistent with technical descriptions. Our framework proves effective, outperforming state-of-the-art (SOTA) baseline models on benchmark datasets.
- Publication:
-
arXiv e-prints
- Pub Date:
- August 2024
- DOI:
- 10.48550/arXiv.2408.11866
- arXiv:
- arXiv:2408.11866
- Bibcode:
- 2024arXiv240811866S
- Keywords:
-
- Computer Science - Computation and Language;
- Computer Science - Artificial Intelligence;
- Computer Science - Machine Learning;
- Quantitative Biology - Biomolecules
- E-Print:
- Paper was accepted at R0-FoMo: Robustness of Few-shot and Zero-shot Learning in Foundation Models, NeurIPS-2023. Please find the links: https://sites.google.com/view/r0-fomo/accepted-papers?authuser=0 and https://neurips.cc/virtual/2023/workshop/66517