Simulation-Aided Policy Tuning for Black-Box Robot Learning

doi:10.48550/arXiv.2411.14246

How can robots learn and adapt to new tasks and situations with little data? Systematic exploration and simulation are crucial tools for efficient robot learning. We present a novel black-box policy search algorithm focused on data-efficient policy improvements. The algorithm learns directly on the robot and treats simulation as an additional information source to speed up the learning process. At the core of the algorithm, a probabilistic model learns the dependence between the policy parameters and the robot learning objective, not only by performing experiments on the robot but also by leveraging data from a simulator. This substantially reduces interaction time with the robot. Using this model, we can guarantee improvements with high probability for each policy update, thereby facilitating fast, goal-oriented learning. We evaluate our algorithm on simulated fine-tuning tasks and demonstrate the data efficiency of the proposed dual-information-source optimization algorithm. In a real robot learning experiment, we show fast and successful task learning on a robot manipulator with the aid of an imperfect simulator.
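The idea of a probabilistic model that pools robot and simulator data, and of accepting a policy update only when improvement is sufficiently probable, can be illustrated with a small sketch. This is not the paper's algorithm: the abstract does not specify the model, so the sketch assumes a Gaussian process with a two-source kernel (a shared simulator term plus an error term that is active only between robot measurements), assumes that a higher objective is better, and uses hypothetical toy objectives, kernel hyperparameters, and a hypothetical 0.95 acceptance threshold.

```python
import math
import numpy as np

def rbf(a, b, lengthscale, variance):
    """Squared-exponential kernel between inputs a (n, d) and b (m, d)."""
    sq = ((a[:, None, :] - b[None, :, :]) ** 2).sum(-1)
    return variance * np.exp(-0.5 * sq / lengthscale**2)

def two_source_kernel(xa, sa, xb, sb):
    """Assumed model: robot objective = simulator objective + error term.
    sa, sb hold 0.0 for simulator points and 1.0 for robot points."""
    k_sim = rbf(xa, xb, lengthscale=0.5, variance=1.0)   # shared by both sources
    k_err = rbf(xa, xb, lengthscale=0.5, variance=0.1)   # sim-to-real mismatch
    both_robot = np.outer(sa, sb)                        # 1 only for robot/robot pairs
    return k_sim + both_robot * k_err

def posterior(x_train, s_train, y_train, x_query, s_query, noise=1e-3):
    """GP posterior mean and covariance at the query points."""
    K = two_source_kernel(x_train, s_train, x_train, s_train)
    K = K + noise * np.eye(len(y_train))
    Ks = two_source_kernel(x_query, s_query, x_train, s_train)
    Kss = two_source_kernel(x_query, s_query, x_query, s_query)
    L = np.linalg.cholesky(K)
    alpha = np.linalg.solve(L.T, np.linalg.solve(L, y_train))
    v = np.linalg.solve(L, Ks.T)
    return Ks @ alpha, Kss - v.T @ v

def improvement_probability(mean, cov):
    """P(J(candidate) > J(current)) for queries ordered [current, candidate]."""
    diff_mean = mean[1] - mean[0]
    diff_var = cov[0, 0] + cov[1, 1] - 2.0 * cov[0, 1]
    diff_std = math.sqrt(max(diff_var, 1e-12))           # guard against round-off
    return 0.5 * (1.0 + math.erf(diff_mean / (diff_std * math.sqrt(2.0))))

# Hypothetical toy problem: many cheap simulator evaluations, two robot ones.
x_sim = np.linspace(0.0, 1.0, 8)[:, None]
x_rob = np.array([[0.2], [0.5]])
y_sim = -(x_sim[:, 0] - 0.6) ** 2                        # imperfect simulator
y_rob = -(x_rob[:, 0] - 0.5) ** 2                        # "real" objective

x_train = np.vstack([x_sim, x_rob])
s_train = np.concatenate([np.zeros(len(x_sim)), np.ones(len(x_rob))])
y_train = np.concatenate([y_sim, y_rob])

# Compare the current policy (0.2) with a candidate (0.4), both on the robot.
x_query = np.array([[0.2], [0.4]])
s_query = np.ones(2)
mean, cov = posterior(x_train, s_train, y_train, x_query, s_query)
p_improve = improvement_probability(mean, cov)
accept_update = p_improve >= 0.95                        # hypothetical threshold
```

In this sketch the simulator data shape the posterior over the whole parameter range, while the few robot measurements correct it through the error kernel. The update is accepted only if the modeled probability of improvement on the robot clears the chosen threshold.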

Publication: arXiv e-prints

Pub Date: November 2024

DOI: 10.48550/arXiv.2411.14246

arXiv: arXiv:2411.14246

Bibcode: 2024arXiv241114246H

Keywords: Computer Science - Robotics; Computer Science - Machine Learning; Electrical Engineering and Systems Science - Systems and Control
