Contextual Networks and Unsupervised Ranking of Sentences
Abstract
We construct a contextual network to represent a document with syntactic and semantic relations between word-sentence pairs, based on which we devise an unsupervised algorithm called CNATAR (Contextual Network And Text Analysis Rank) to score sentences, and rank them through a bi-objective 0-1 knapsack maximization problem over topic analysis and sentence scores. We show that CNATAR outperforms the combined ranking of the three human judges provided on the SummBank dataset under both ROUGE and BLEU metrics, which in term significantly outperforms each individual judge's ranking. Moreover, CNATAR produces so far the highest ROUGE scores over DUC-02, and outperforms previous supervised algorithms on the CNN/DailyMail and NYT datasets. We also compare the performance of CNATAR and the latest supervised neural-network summarization models and compute oracle results.
- Publication:
-
arXiv e-prints
- Pub Date:
- March 2022
- DOI:
- 10.48550/arXiv.2203.04459
- arXiv:
- arXiv:2203.04459
- Bibcode:
- 2022arXiv220304459Z
- Keywords:
-
- Computer Science - Computation and Language;
- Computer Science - Machine Learning
- E-Print:
- ICTAI 2021