Co-NavGPT: Multi-Robot Cooperative Visual Semantic Navigation using Large Language Models
Abstract
In advanced human-robot interaction tasks, visual target navigation is crucial for autonomous robots navigating unknown environments. While numerous approaches have been developed in the past, most are designed for single-robot operations, which often suffer from reduced efficiency and robustness due to environmental complexities. Furthermore, learning policies for multi-robot collaboration are resource-intensive. To address these challenges, we propose Co-NavGPT, an innovative framework that integrates Large Language Models (LLMs) as a global planner for multi-robot cooperative visual target navigation. Co-NavGPT encodes the explored environment data into prompts, enhancing LLMs' scene comprehension. It then assigns exploration frontiers to each robot for efficient target search. Experimental results on Habitat-Matterport 3D (HM3D) demonstrate that Co-NavGPT surpasses existing models in success rates and efficiency without any learning process, demonstrating the vast potential of LLMs in multi-robot collaboration domains. The supplementary video, prompts, and code can be accessed via the following link: https://sites.google.com/view/co-navgpt
- Publication:
-
arXiv e-prints
- Pub Date:
- October 2023
- DOI:
- arXiv:
- arXiv:2310.07937
- Bibcode:
- 2023arXiv231007937Y
- Keywords:
-
- Computer Science - Robotics;
- Computer Science - Artificial Intelligence
- E-Print:
- 7 pages, 4 figures, conference