Generating Wikipedia by Summarizing Long Sequences
Abstract
We show that generating English Wikipedia articles can be approached as a multi-document summarization of source documents. We use extractive summarization to coarsely identify salient information and a neural abstractive model to generate the article. For the abstractive model, we introduce a decoder-only architecture that can scalably attend to very long sequences, much longer than typical encoder-decoder architectures used in sequence transduction. We show that this model can generate fluent, coherent multi-sentence paragraphs and even whole Wikipedia articles. When given reference documents, we show it can extract relevant factual information as reflected in perplexity, ROUGE scores and human evaluations.
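The scalability claim in the abstract rests on replacing full self-attention, whose cost grows quadratically in sequence length, with a cheaper variant. Below is a minimal sketch of one such variant, memory-compressed attention, which shrinks keys and values along the sequence axis with a strided convolution; the class name, the stride value, and the omission of causal masking are illustrative assumptions, not the authors' reference implementation.

```python
# Sketch of memory-compressed self-attention (assumed details, not the
# paper's exact code): keys and values are downsampled with a strided
# 1-D convolution, so the attention matrix is (n x n/stride) instead of
# (n x n). Causal masking is omitted here for brevity.
import torch
import torch.nn as nn


class MemoryCompressedAttention(nn.Module):
    def __init__(self, d_model: int, stride: int = 3):
        super().__init__()
        self.q = nn.Linear(d_model, d_model)
        self.kv = nn.Linear(d_model, 2 * d_model)
        # Strided convolution compresses the sequence axis by `stride`.
        self.compress = nn.Conv1d(d_model, d_model,
                                  kernel_size=stride, stride=stride)
        self.scale = d_model ** -0.5

    def forward(self, x: torch.Tensor) -> torch.Tensor:
        # x: (batch, seq_len, d_model)
        q = self.q(x)
        k, v = self.kv(x).chunk(2, dim=-1)
        # Conv1d expects (batch, channels, seq), so transpose in and out.
        k = self.compress(k.transpose(1, 2)).transpose(1, 2)
        v = self.compress(v.transpose(1, 2)).transpose(1, 2)
        # attn: (batch, seq_len, seq_len // stride)
        attn = torch.softmax(q @ k.transpose(1, 2) * self.scale, dim=-1)
        return attn @ v  # (batch, seq_len, d_model)
```

With a stride of 3, a 12,000-token input yields a 12,000 x 4,000 attention matrix rather than 12,000 x 12,000, which is the kind of reduction that lets a decoder-only model attend to inputs far longer than typical encoder-decoder setups.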
- Publication:
- arXiv e-prints
- Pub Date:
- January 2018
- DOI:
- 10.48550/arXiv.1801.10198
- arXiv:
- arXiv:1801.10198
- Bibcode:
- 2018arXiv180110198L
- Keywords:
- Computer Science - Computation and Language
- E-Print:
- Published as a conference paper at ICLR 2018