Constructing Hierarchical Q&A Datasets for Video Story Understanding

doi:10.48550/arXiv.1904.00623

Video understanding is emerging as a new paradigm for studying human-like AI. Question answering (Q&A) is used as a general benchmark to measure the level of intelligence in video understanding. While several previous studies have proposed datasets for video Q&A tasks, they did not really incorporate story-level understanding, resulting in datasets that are highly biased and lack variance in question difficulty. In this paper, we propose a hierarchical method for building Q&A datasets, i.e., one organized by hierarchical difficulty levels. We introduce three criteria for video story understanding: memory capacity, logical complexity, and the DIKW (Data-Information-Knowledge-Wisdom) pyramid. We discuss how a three-dimensional map constructed from these criteria can be used as a metric for evaluating levels of intelligence related to video story understanding.

Publication: arXiv e-prints

Pub Date: April 2019

DOI: 10.48550/arXiv.1904.00623

arXiv: arXiv:1904.00623

Bibcode: 2019arXiv190400623H

Keywords: Computer Science - Artificial Intelligence; Computer Science - Computer Vision and Pattern Recognition; Computer Science - Machine Learning; Computer Science - Multimedia

E-Print: Accepted to AAAI 2019 Spring Symposium Series: Story-Enabled Intelligence
