Eyes on the Game: Deciphering Implicit Human Signals to Infer Human Proficiency, Trust, and Intent
Abstract
Effective collaboration between humans and AIs hinges on transparent communication and alignment of mental models. However, explicit, verbal communication is not always feasible. Under such circumstances, human-human teams often depend on implicit, nonverbal cues to glean important information about their teammates such as intent and expertise, thereby bolstering team alignment and adaptability. Among these implicit cues, two of the most salient and fundamental are a human's actions in the environment and their visual attention. In this paper, we present a novel method to combine eye gaze data and behavioral data, and evaluate their respective predictive power for human proficiency, trust, and intent. We first collect a dataset of paired eye gaze and gameplay data in the fast-paced collaborative "Overcooked" environment. We then train models on this dataset to compare how the predictive powers differ between gaze data, gameplay data, and their combination. We additionally compare our method to prior works that aggregate eye gaze data and demonstrate how these aggregation methods can substantially reduce the predictive ability of eye gaze. Our results indicate that, while eye gaze data and gameplay data excel in different situations, a model that integrates both types consistently outperforms all baselines. This work paves the way for developing intuitive and responsive agents that can efficiently adapt to new teammates.
- Publication:
-
arXiv e-prints
- Pub Date:
- July 2024
- DOI:
- 10.48550/arXiv.2407.03298
- arXiv:
- arXiv:2407.03298
- Bibcode:
- 2024arXiv240703298H
- Keywords:
-
- Computer Science - Human-Computer Interaction
- E-Print:
- 7 pages, 5 figures, To be published in The 33rd IEEE International Conference on Robot and Human Interactive Communication, IEEE RO-MAN 2024