Deep Reinforcement Learning in a Handful of Trials using Probabilistic Dynamics Models

doi:10.48550/arXiv.1805.12114

Model-based reinforcement learning (RL) algorithms can attain excellent sample efficiency, but often lag behind the best model-free algorithms in terms of asymptotic performance. This is especially true with high-capacity parametric function approximators, such as deep networks. In this paper, we study how to bridge this gap by employing uncertainty-aware dynamics models. We propose a new algorithm called probabilistic ensembles with trajectory sampling (PETS) that combines uncertainty-aware deep network dynamics models with sampling-based uncertainty propagation. Our comparison to state-of-the-art model-based and model-free deep RL algorithms shows that our approach matches the asymptotic performance of model-free algorithms on several challenging benchmark tasks, while requiring significantly fewer samples (e.g., 8 and 125 times fewer samples than Soft Actor Critic and Proximal Policy Optimization, respectively, on the half-cheetah task).
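To make the idea of sampling-based uncertainty propagation through an ensemble more concrete, the following is a minimal sketch, not the authors' implementation. It assumes toy one-dimensional linear-Gaussian models in place of deep networks. The names `make_toy_model` and `trajectory_sampling_return`, the reward function, and the choice to tie each particle to one ensemble member for the whole rollout are illustrative assumptions.

```python
import random


def make_toy_model(gain, noise_std):
    """Return a stand-in probabilistic dynamics model (hypothetical, not a deep network).

    The model maps (state, action) to the mean and standard deviation
    of a Gaussian distribution over the next state.
    """
    def model(state, action):
        mean = gain * state + action
        return mean, noise_std
    return model


def trajectory_sampling_return(models, reward_fn, init_state, actions,
                               n_particles=20, rng=None):
    """Estimate the expected return of an action sequence with particles.

    Assumption: each particle is tied to one ensemble member for the whole
    rollout and samples its next state from that member's Gaussian output.
    Disagreement between members spreads the particles apart, and the
    per-step Gaussian sampling adds the predicted noise.
    """
    if rng is None:
        rng = random.Random(0)
    total = 0.0
    for p in range(n_particles):
        model = models[p % len(models)]  # fixed ensemble member per particle
        state = init_state
        for action in actions:
            mean, std = model(state, action)
            state = rng.gauss(mean, std)  # sample from the predicted distribution
            total += reward_fn(state, action)
    return total / n_particles  # average return over all particles


# Usage: two ensemble members that disagree about the dynamics.
ensemble = [make_toy_model(1.0, 0.1), make_toy_model(0.5, 0.1)]
estimate = trajectory_sampling_return(
    ensemble,
    reward_fn=lambda s, a: -(s ** 2),  # hypothetical reward: stay near zero
    init_state=2.0,
    actions=[0.0, -0.5, -0.5],
)
```

The returned estimate averages over both sources of spread in this toy setting: the noise each model predicts and the disagreement between ensemble members.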

Publication: arXiv e-prints

Pub Date: May 2018

DOI: 10.48550/arXiv.1805.12114

arXiv: arXiv:1805.12114

Bibcode: 2018arXiv180512114C

Keywords: Computer Science - Machine Learning; Computer Science - Artificial Intelligence; Computer Science - Robotics; Statistics - Machine Learning

E-Print: NIPS 2018, video and code available at https://sites.google.com/view/drl-in-a-handful-of-trials/
