Knowledge Enhanced Sports Game Summarization
Abstract
Sports game summarization aims at generating sports news from live commentaries. However, existing datasets are all constructed through automated collection and cleaning processes, resulting in a lot of noise. Besides, current works neglect the knowledge gap between live commentaries and sports news, which limits the performance of sports game summarization. In this paper, we introduce K-SportsSum, a new dataset with two characteristics: (1) K-SportsSum collects a large amount of data from massive games. It has 7,854 commentary-news pairs. To improve the quality, K-SportsSum employs a manual cleaning process; (2) Different from existing datasets, to narrow the knowledge gap, K-SportsSum further provides a large-scale knowledge corpus that contains the information of 523 sports teams and 14,724 sports players. Additionally, we also introduce a knowledge-enhanced summarizer that utilizes both live commentaries and the knowledge to generate sports news. Extensive experiments on K-SportsSum and SportsSum datasets show that our model achieves new state-of-the-art performances. Qualitative analysis and human study further verify that our model generates more informative sports news.
- Publication:
-
arXiv e-prints
- Pub Date:
- November 2021
- DOI:
- 10.48550/arXiv.2111.12535
- arXiv:
- arXiv:2111.12535
- Bibcode:
- 2021arXiv211112535W
- Keywords:
-
- Computer Science - Computation and Language;
- Computer Science - Artificial Intelligence
- E-Print:
- Accepted to WSDM 2022