Empowerment-driven Exploration using Mutual Information Estimation
Abstract
Exploration is a difficult challenge in reinforcement learning and is of prime importance in sparse reward environments. However, many of the state of the art deep reinforcement learning algorithms, that rely on epsilon-greedy, fail on these environments. In such cases, empowerment can serve as an intrinsic reward signal to enable the agent to maximize the influence it has over the near future. We formulate empowerment as the channel capacity between states and actions and is calculated by estimating the mutual information between the actions and the following states. The mutual information is estimated using Mutual Information Neural Estimator and a forward dynamics model. We demonstrate that an empowerment driven agent is able to improve significantly the score of a baseline DQN agent on the game of Montezuma's Revenge.
- Publication:
-
arXiv e-prints
- Pub Date:
- October 2018
- DOI:
- arXiv:
- arXiv:1810.05533
- Bibcode:
- 2018arXiv181005533M
- Keywords:
-
- Computer Science - Machine Learning;
- Computer Science - Artificial Intelligence;
- Statistics - Machine Learning
- E-Print:
- Preprint. Under Development. arXiv admin note: text overlap with arXiv:1807.02078 by other authors