On Adversarial Robustness of Synthetic Code Generation

doi:10.48550/arXiv.2106.11629

Automatic code synthesis from natural language descriptions is a challenging task. Recent years have seen substantial progress in code generation systems for domain-specific languages (DSLs) built on sequence-to-sequence deep learning techniques. In this paper, we experiment with generative models for the AlgoLisp DSL and use several classes of adversarial examples to demonstrate significant dataset bias. We also experiment with two Transformer-based model variants that outperform all existing AlgoLisp code generation baselines. Like current state-of-the-art systems, our proposed models perform poorly under adversarial settings. We therefore propose several dataset augmentation techniques to reduce this bias and demonstrate their efficacy experimentally.

Publication:

arXiv e-prints

Pub Date:

June 2021

DOI:

10.48550/arXiv.2106.11629

arXiv:

arXiv:2106.11629

Bibcode:

2021arXiv210611629A

Keywords:

Computer Science - Machine Learning;
Computer Science - Software Engineering
