QUOTA: The Quantile Option Architecture for Reinforcement Learning

doi:10.48550/arXiv.1811.02073

In this paper, we propose the Quantile Option Architecture (QUOTA) for exploration, based on recent advances in distributional reinforcement learning (RL). In QUOTA, decision making is based on quantiles of a value distribution, not only on its mean. QUOTA provides a new dimension for exploration by making use of both the optimism and the pessimism of a value distribution. We demonstrate the performance advantage of QUOTA in both challenging video games and physical robot simulators.
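
To make the contrast between mean-based and quantile-based decisions concrete, here is a minimal sketch. It is not the QUOTA architecture itself, which the abstract does not specify. It assumes each action already has a short, sorted list of quantile estimates of its return. The function names `greedy_by_mean` and `greedy_by_quantile` and all the numbers are hypothetical and chosen only for illustration.

```python
from statistics import fmean


def greedy_by_mean(quantiles):
    """Return the action whose quantile estimates have the highest mean."""
    return max(range(len(quantiles)), key=lambda a: fmean(quantiles[a]))


def greedy_by_quantile(quantiles, k):
    """Return the action with the highest k-th quantile estimate.

    k = 0 is the lowest quantile (a pessimistic view of the return),
    k = len(quantiles[a]) - 1 is the highest (an optimistic view).
    """
    return max(range(len(quantiles)), key=lambda a: quantiles[a][k])


# Hypothetical quantile estimates (sorted low to high) for three actions.
quantiles = [
    [1.0, 1.0, 1.0, 1.0, 1.0],    # action 0: safe, no spread
    [-1.0, 0.5, 1.5, 2.5, 3.5],   # action 1: best mean, moderate spread
    [-6.0, -2.0, 0.0, 2.0, 9.0],  # action 2: wide spread, large upside
]

mean_choice = greedy_by_mean(quantiles)
pessimistic_choice = greedy_by_quantile(quantiles, 0)
optimistic_choice = greedy_by_quantile(quantiles, 4)

print(mean_choice, pessimistic_choice, optimistic_choice)
```

With these illustrative numbers the three criteria each pick a different action, which is the sense in which looking at quantiles rather than only the mean opens up different exploratory behaviors.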

Publication: arXiv e-prints

Pub Date: November 2018

DOI: 10.48550/arXiv.1811.02073

arXiv: arXiv:1811.02073

Bibcode: 2018arXiv181102073Z

Keywords: Computer Science - Machine Learning; Computer Science - Artificial Intelligence

E-Print: AAAI 2019
