Kinova Gemini: Interactive Robot Grasping with Visual Reasoning and Conversational AI

doi:10.48550/arXiv.2209.01319

To apply recent advances in robotics and AI to delicate collaboration between humans and machines, we propose Kinova Gemini, an original robotic system that integrates conversational AI dialogue and visual reasoning so that the Kinova Gen3 lite robot can help people retrieve objects or complete perception-based pick-and-place tasks. When a person walks up to the Kinova Gen3 lite, Kinova Gemini can fulfill the user's requests in three different applications: (1) It starts a natural dialogue with the person and retrieves objects, handing them to the user one by one. (2) It detects diverse objects with YOLO v3 and recognizes the color attributes of each item, then either asks through the dialogue whether the user wants that item grasped or lets the user choose which specific item is required. (3) It applies YOLO v3 to recognize multiple objects and lets the user choose two items for perception-based pick-and-place tasks such as "Put the banana into the bowl", using visual reasoning and conversational interaction.
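As an illustration of application (3), the following minimal Python sketch shows one way a spoken or typed command could be grounded against a list of detected objects. It is not the authors' implementation: the `Detection` record, the functions `find_mentions` and `parse_pick_and_place`, and the sample scene are all hypothetical, and the detector output (which the abstract says comes from YOLO v3) is replaced here by hand-written labels, colors, and boxes.

```python
import re
from dataclasses import dataclass
from typing import List, Optional, Tuple


@dataclass
class Detection:
    """One detected object (hypothetical stand-in for real detector output)."""
    label: str                       # class name, e.g. "banana"
    color: str                       # color attribute, e.g. "yellow"
    box: Tuple[int, int, int, int]   # (x_min, y_min, x_max, y_max) in pixels


def find_mentions(command: str, detections: List[Detection]) -> List[Detection]:
    """Return detections whose label occurs in the command, in order of mention."""
    text = command.lower()
    hits = []
    for det in detections:
        # Whole-word match so that "bowl" does not match "bowling".
        match = re.search(r"\b" + re.escape(det.label.lower()) + r"\b", text)
        if match:
            hits.append((match.start(), det))
    hits.sort(key=lambda pair: pair[0])
    return [det for _, det in hits]


def parse_pick_and_place(
    command: str, detections: List[Detection]
) -> Optional[Tuple[Detection, Detection]]:
    """Map a command to (object to pick, place target).

    Assumes the first mentioned object is picked and the second is the
    destination. Returns None when the command does not name exactly two
    detected objects, which is where a dialogue system would ask a
    follow-up question instead of acting.
    """
    mentioned = find_mentions(command, detections)
    if len(mentioned) != 2:
        return None
    return mentioned[0], mentioned[1]


# Hand-written scene standing in for detector output.
scene = [
    Detection("bowl", "white", (300, 200, 420, 300)),
    Detection("banana", "yellow", (50, 60, 150, 120)),
    Detection("cup", "red", (200, 80, 260, 160)),
]

plan = parse_pick_and_place("Put the banana into the bowl", scene)
if plan is not None:
    pick, place = plan
    print(f"pick the {pick.color} {pick.label}, place it in the {place.label}")
```

Returning `None` for incomplete or ambiguous commands keeps the decision to act separate from the parsing, so a conversational layer can ask the user to clarify before the robot moves.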

Publication: arXiv e-prints
Pub Date: September 2022
DOI: 10.48550/arXiv.2209.01319
arXiv: arXiv:2209.01319
Bibcode: 2022arXiv220901319C
Keywords: Computer Science - Robotics; Computer Science - Human-Computer Interaction

NASA/ADS
