Extraction of Salient Sentences from Labelled Documents
Abstract
We present a hierarchical convolutional document model with an architecture designed to support introspection of the document structure. Using this model, we show how to use visualisation techniques from the computer vision literature to identify and extract topic-relevant sentences. We also introduce a new scalable evaluation technique for automatic sentence extraction systems that avoids the need for time-consuming human annotation of validation data.
- Publication: arXiv e-prints
- Pub Date: December 2014
- DOI: 10.48550/arXiv.1412.6815
- arXiv: arXiv:1412.6815
- Bibcode: 2014arXiv1412.6815D
- Keywords: Computer Science - Computation and Language; Computer Science - Information Retrieval; Computer Science - Machine Learning
- E-Print: arXiv admin note: substantial text overlap with arXiv:1406.3830