Blindfold Baselines for Embodied QA
Abstract
We explore blindfold (question-only) baselines for Embodied Question Answering. The EmbodiedQA task requires an agent to answer a question by intelligently navigating in a simulated environment, gathering necessary visual information only through first-person vision before finally answering. Consequently, a blindfold baseline which ignores the environment and visual information is a degenerate solution, yet we show through our experiments on the EQAv1 dataset that a simple question-only baseline achieves state-of-the-art results on the EmbodiedQA task in all cases except when the agent is spawned extremely close to the object.
- Publication:
-
arXiv e-prints
- Pub Date:
- November 2018
- DOI:
- 10.48550/arXiv.1811.05013
- arXiv:
- arXiv:1811.05013
- Bibcode:
- 2018arXiv181105013A
- Keywords:
-
- Computer Science - Computer Vision and Pattern Recognition;
- Computer Science - Artificial Intelligence;
- Computer Science - Computation and Language;
- Computer Science - Machine Learning
- E-Print:
- NIPS 2018 Visually-Grounded Interaction and Language (ViGilL) Workshop