CAMON: Cooperative Agents for Multi-Object Navigation with LLM-based Conversations
Abstract
Visual navigation tasks are critical for household service robots. As these tasks become increasingly complex, effective communication and collaboration among multiple robots become imperative to ensure successful completion. In recent years, large language models (LLMs) have exhibited remarkable comprehension and planning abilities in the context of embodied agents. However, their application in household scenarios, specifically to multiple agents that communicate in order to collaboratively complete complex navigation tasks, remains unexplored. Therefore, this paper proposes a framework for decentralized multi-agent navigation that leverages LLM-enabled communication and collaboration. By designing a communication-triggered dynamic leadership organization structure, we achieve faster team consensus with fewer communication instances, leading to better navigation effectiveness and collaborative exploration efficiency. With the proposed communication scheme, our framework promises to be conflict-free and robust in multi-object navigation tasks, even when team size increases sharply.
- Publication:
- arXiv e-prints
- Pub Date:
- June 2024
- DOI:
- 10.48550/arXiv.2407.00632
- arXiv:
- arXiv:2407.00632
- Bibcode:
- 2024arXiv240700632W
- Keywords:
- Computer Science - Robotics;
- Computer Science - Computation and Language;
- Computer Science - Computer Vision and Pattern Recognition;
- Computer Science - Multiagent Systems
- E-Print:
- Accepted to the RSS 2024 Workshop: GROUND