Feedback GAN (FBGAN) for DNA: a Novel Feedback-Loop Architecture for Optimizing Protein Functions
Abstract
Generative Adversarial Networks (GANs) represent an attractive and novel approach to generate realistic data, such as genes, proteins, or drugs, in synthetic biology. Here, we apply GANs to generate synthetic DNA sequences encoding for proteins of variable length. We propose a novel feedback-loop architecture, called Feedback GAN (FBGAN), to optimize the synthetic gene sequences for desired properties using an external function analyzer. The proposed architecture also has the advantage that the analyzer need not be differentiable. We apply the feedback-loop mechanism to two examples: 1) generating synthetic genes coding for antimicrobial peptides, and 2) optimizing synthetic genes for the secondary structure of their resulting peptides. A suite of metrics demonstrate that the GAN generated proteins have desirable biophysical properties. The FBGAN architecture can also be used to optimize GAN-generated datapoints for useful properties in domains beyond genomics.
- Publication:
-
arXiv e-prints
- Pub Date:
- April 2018
- DOI:
- 10.48550/arXiv.1804.01694
- arXiv:
- arXiv:1804.01694
- Bibcode:
- 2018arXiv180401694G
- Keywords:
-
- Quantitative Biology - Genomics;
- Computer Science - Machine Learning;
- Computer Science - Neural and Evolutionary Computing