Adaptive Variance for Changing Sparse-Reward Environments

doi:10.48550/arXiv.1903.06309

Robots that are trained to perform a task in a fixed environment often fail when facing unexpected changes to the environment due to a lack of exploration. We propose a principled way to adapt the policy for better exploration in changing sparse-reward environments. Unlike previous works that explicitly model environmental changes, we analyze the relationship between the value function and the optimal exploration for a Gaussian-parameterized policy, and show that our theory leads to an effective strategy for adjusting the variance of the policy, enabling fast adaptation to changes in a variety of sparse-reward environments.

Publication:

arXiv e-prints

Pub Date:

March 2019

DOI:

10.48550/arXiv.1903.06309

arXiv:

arXiv:1903.06309

Bibcode:

2019arXiv190306309L

Keywords:

Computer Science - Robotics;
Computer Science - Artificial Intelligence

E-Print:

Accepted as a conference paper at the International Conference on Robotics and Automation (ICRA) 2019
