Multi-task Reinforcement Learning with a Planning Quasi-Metric
Abstract
We introduce a new reinforcement learning approach combining a planning quasi-metric (PQM) that estimates the number of steps required to go from any state to another, with task-specific "aimers" that compute a target state to reach a given goal. This decomposition allows the sharing across tasks of a task-agnostic model of the quasi-metric that captures the environment's dynamics and can be learned in a dense and unsupervised manner. We achieve multiple-fold training speed-up compared to recently published methods on the standard bit-flip problem and in the MuJoCo robotic arm simulator.
- Publication:
-
arXiv e-prints
- Pub Date:
- February 2020
- DOI:
- 10.48550/arXiv.2002.03240
- arXiv:
- arXiv:2002.03240
- Bibcode:
- 2020arXiv200203240M
- Keywords:
-
- Computer Science - Machine Learning;
- Statistics - Machine Learning
- E-Print:
- Deep RL Workshop, NeurIPS 2020