SurveySum: A Dataset for Summarizing Multiple Scientific Articles into a Survey Section
Abstract
Document summarization is the task of condensing texts into concise and informative summaries. This paper introduces a novel dataset designed for summarizing multiple scientific articles into a section of a survey. Our contributions are: (1) SurveySum, a new dataset addressing the gap in domain-specific summarization tools; (2) two pipelines for summarizing scientific articles into a section of a survey; and (3) an evaluation of these pipelines using multiple metrics to compare their performance. Our results highlight the importance of high-quality retrieval stages and the impact of different configurations on the quality of the generated summaries.
- Publication: arXiv e-prints
- Pub Date: August 2024
- DOI:
- arXiv: arXiv:2408.16444
- Bibcode: 2024arXiv240816444C
- Keywords: Computer Science - Computation and Language
- E-Print: 15 pages, 6 figures, 1 table. Submitted to BRACIS 2024