Active Reinforcement Learning: Observing Rewards at a Cost

Active reinforcement learning (ARL) is a variant on reinforcement learning where the agent does not observe the reward unless it chooses to pay a query cost c > 0. The central question of ARL is how to quantify the long-term value of reward information. Even in multi-armed bandits, computing the value of this information is intractable and we have to rely on heuristics. We propose and evaluate several heuristic approaches for ARL in multi-armed bandits and (tabular) Markov decision processes, and discuss and illustrate some challenging aspects of the ARL problem.
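
The setting described in the abstract can be illustrated with a minimal sketch of a multi-armed bandit in which the reward is revealed only when the agent pays the query cost c. Everything named here is hypothetical and chosen for illustration: the class `ActiveBandit`, the function `run_query_budget_agent`, the Bernoulli reward model, and the fixed per-arm query budget. The agent shown is a naive baseline, not one of the heuristics proposed in the paper, and the sketch assumes the agent still accrues unobserved rewards.

```python
import random


class ActiveBandit:
    """Bernoulli bandit whose reward is revealed only if the agent pays to query it."""

    def __init__(self, means, query_cost, seed=0):
        self.means = means              # success probability of each arm (assumed model)
        self.query_cost = query_cost    # the cost c > 0 of observing one reward
        self.rng = random.Random(seed)

    def pull(self, arm, query):
        """Return (observed_reward_or_None, net_return) for one pull."""
        reward = 1.0 if self.rng.random() < self.means[arm] else 0.0
        observed = reward if query else None
        # Assumption: the reward is accrued either way; querying only subtracts c.
        net = reward - (self.query_cost if query else 0.0)
        return observed, net


def run_query_budget_agent(env, n_arms, horizon, queries_per_arm):
    """Naive baseline: query each arm a fixed number of times, then exploit blindly."""
    counts = [0] * n_arms   # number of observed rewards per arm
    sums = [0.0] * n_arms   # sum of observed rewards per arm
    total_net = 0.0
    n_queries = 0
    for _ in range(horizon):
        under_sampled = [a for a in range(n_arms) if counts[a] < queries_per_arm]
        if under_sampled:
            arm, query = under_sampled[0], True
        else:
            # Exploit the best empirical mean without paying for more information.
            arm = max(range(n_arms), key=lambda a: sums[a] / max(counts[a], 1))
            query = False
        observed, net = env.pull(arm, query)
        total_net += net
        if query:
            counts[arm] += 1
            sums[arm] += observed
            n_queries += 1
    return total_net, n_queries
```

Calling `run_query_budget_agent(ActiveBandit([0.2, 0.8], query_cost=0.1), n_arms=2, horizon=100, queries_per_arm=5)` spends exactly ten queries and then commits to one arm. The budget is fixed in advance rather than derived from any estimate of the value of information, which is the hard question the abstract raises.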

Publication:

arXiv e-prints

Pub Date:

November 2020

DOI:

10.48550/arXiv.2011.06709

arXiv:

arXiv:2011.06709

Bibcode:

2020arXiv201106709K

Keywords:

Computer Science - Machine Learning;
Computer Science - Artificial Intelligence;
Statistics - Machine Learning

E-Print:

Originally appeared at the NeurIPS 2016 "Future of Interactive Learning Machines (FILM)" workshop
