Dyna-AIL : Adversarial Imitation Learning by Planning
Abstract
Adversarial methods for imitation learning have been shown to perform well on various control tasks. However, they require a large number of environment interactions for convergence. In this paper, we propose an end-to-end differentiable adversarial imitation learning algorithm in a Dyna-like framework for switching between model-based planning and model-free learning from expert data. Our results on both discrete and continuous environments show that our approach of using model-based planning along with model-free learning converges to an optimal policy with fewer number of environment interactions in comparison to the state-of-the-art learning methods.
- Publication:
-
arXiv e-prints
- Pub Date:
- March 2019
- DOI:
- 10.48550/arXiv.1903.03234
- arXiv:
- arXiv:1903.03234
- Bibcode:
- 2019arXiv190303234S
- Keywords:
-
- Computer Science - Machine Learning;
- Computer Science - Artificial Intelligence;
- Statistics - Machine Learning
- E-Print:
- 8 pages, 6 figures, pre-print