SequenceR: Sequence-to-Sequence Learning for End-to-End Program Repair

This paper presents a novel end-to-end approach to program repair based on sequence-to-sequence learning. We devise, implement, and evaluate a system, called SequenceR, for fixing bugs based on sequence-to-sequence learning on source code. This approach uses the copy mechanism to overcome the unlimited vocabulary problem that occurs with big code. Our system is data-driven; we train it on 35,578 samples, carefully curated from commits to open-source repositories. We evaluate it on 4,711 independent real bug fixes, as well as on the Defects4J benchmark used in program repair research. SequenceR is able to perfectly predict the fixed line for 950 of the 4,711 testing samples, and to find correct patches for 14 bugs in Defects4J. It captures a wide range of repair operators without any domain-specific top-down design.
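
As a rough illustration of how a copy mechanism can sidestep a fixed vocabulary, the sketch below mixes a generation distribution over a small vocabulary with a copy distribution over the source tokens, in the general style of pointer-generator models. The function name, the toy vocabulary, the attention weights, and the p_gen value are all assumptions made for illustration; none of them is taken from SequenceR's implementation.

```python
from collections import defaultdict


def copy_augmented_distribution(vocab_probs, source_tokens, attention, p_gen):
    """Mix generating from a fixed vocabulary with copying from the source.

    vocab_probs:   dict mapping token -> probability over the fixed vocabulary
    source_tokens: tokens of the (buggy) input sequence
    attention:     one attention weight per source token, summing to 1
    p_gen:         probability of generating instead of copying (hypothetical value)
    """
    if len(source_tokens) != len(attention):
        raise ValueError("one attention weight per source token is required")

    final = defaultdict(float)
    # Generation path: scale the vocabulary distribution by p_gen.
    for token, prob in vocab_probs.items():
        final[token] += p_gen * prob
    # Copy path: route the remaining mass to source tokens via attention,
    # so identifiers outside the vocabulary can still be produced.
    for token, weight in zip(source_tokens, attention):
        final[token] += (1.0 - p_gen) * weight
    return dict(final)


# Toy example: "myCounter" is not in the vocabulary, only in the source line.
vocab_probs = {"if": 0.5, "(": 0.2, ")": 0.2, "<unk>": 0.1}
source_tokens = ["if", "(", "myCounter", ")"]
attention = [0.1, 0.1, 0.7, 0.1]

dist = copy_augmented_distribution(vocab_probs, source_tokens, attention, p_gen=0.4)
best = max(dist, key=dist.get)
print(best)
```

In this toy setting the out-of-vocabulary identifier receives probability mass only through the copy path, which is the property the abstract relies on when it says the copy mechanism addresses the unlimited vocabulary problem.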

Publication:

arXiv e-prints

Pub Date:

December 2018

DOI:

10.48550/arXiv.1901.01808

arXiv:

arXiv:1901.01808

Bibcode:

2019arXiv190101808C

Keywords:

Computer Science - Software Engineering;
Computer Science - Machine Learning;
Statistics - Machine Learning

E-Print:

21 pages, 15 figures
