End-to-End Content and Plan Selection for Data-to-Text Generation

doi:10.48550/arXiv.1810.04700

Learning to generate fluent natural language from structured data with neural networks has become a common approach for NLG. This problem can be challenging when the form of the structured data varies between examples. This paper presents a survey of several extensions to sequence-to-sequence models to account for the latent content selection process, particularly variants of copy attention and coverage decoding. We further propose a training method based on diverse ensembling to encourage models to learn distinct sentence templates during training. An empirical evaluation of these techniques shows an increase in the quality of generated text across five automated metrics, as well as human evaluation.

Publication: arXiv e-prints

Pub Date: October 2018

DOI: 10.48550/arXiv.1810.04700

arXiv: arXiv:1810.04700

Bibcode: 2018arXiv181004700G

Keywords: Computer Science - Computation and Language; Computer Science - Artificial Intelligence

E-Print: INLG 2018
