Reinforcement Learning under Threats

doi:10.48550/arXiv.1809.01560

In several reinforcement learning (RL) scenarios, mainly in security settings, there may be adversaries trying to interfere with the reward generating process. In this paper, we introduce Threatened Markov Decision Processes (TMDPs), which provide a framework to support a decision maker against a potential adversary in RL. Furthermore, we propose a level-$k$ thinking scheme resulting in a new learning framework to deal with TMDPs. After introducing our framework and deriving theoretical results, relevant empirical evidence is given via extensive experiments, showing the benefits of accounting for adversaries while the agent learns.

Publication:

arXiv e-prints

Pub Date:

September 2018

DOI:

10.48550/arXiv.1809.01560

arXiv:

arXiv:1809.01560

Bibcode:

2018arXiv180901560G

Keywords:

Computer Science - Machine Learning;
Computer Science - Artificial Intelligence;
Computer Science - Cryptography and Security;
Statistics - Machine Learning

E-Print:

Extends the version published in the Proceedings of the AAAI Conference on Artificial Intelligence 33, https://www.aaai.org/ojs/index.php/AAAI/article/view/5106
