SentiCap: Generating Image Descriptions with Sentiments
Abstract
Recent progress in image recognition and language modeling is making automatic description of image content a reality. However, stylized, non-factual aspects of the written description are missing from current systems. One such style is emotional description, which is commonplace in everyday communication and influences decision-making and interpersonal relationships. We design a system to describe an image with emotions, and present a model that automatically generates captions with positive or negative sentiments. We propose a novel switching recurrent neural network with word-level regularization, which is able to produce emotional image captions using only 2000+ training sentences containing sentiments. We evaluate the captions with different automatic and crowd-sourcing metrics. Our model compares favourably in common quality metrics for image captioning. In 84.6% of cases, the generated positive captions were judged as being at least as descriptive as the factual captions. Of these positive captions, 88% were confirmed by the crowd-sourced workers as having the appropriate sentiment.
- Publication: arXiv e-prints
- Pub Date: October 2015
- DOI: 10.48550/arXiv.1510.01431
- arXiv: arXiv:1510.01431
- Bibcode: 2015arXiv151001431M
- Keywords: Computer Science - Computer Vision and Pattern Recognition; Computer Science - Computation and Language; I.2.10; I.2.7; I.2.6