Underspecification in Scene Description-to-Depiction Tasks

Questions regarding implicitness, ambiguity and underspecification are crucial for understanding the task validity and ethical concerns of multimodal image+text systems, yet have received little attention to date. This position paper maps out a conceptual framework to address this gap, focusing on systems which generate images depicting scenes from scene descriptions. In doing so, we account for how texts and images convey meaning differently. We outline a set of core challenges concerning textual and visual ambiguity, as well as risks that may be amplified by ambiguous and underspecified elements. We propose and discuss strategies for addressing these challenges, including generating visually ambiguous images, and generating a set of diverse images.

Publication: arXiv e-prints
Pub Date: October 2022
DOI: 10.48550/arXiv.2210.05815
arXiv: arXiv:2210.05815
Bibcode: 2022arXiv221005815H
Keywords: Computer Science - Computer Vision and Pattern Recognition; Computer Science - Computation and Language
