Delving into adversarial attacks on deep policies
Abstract
Adversarial examples have been shown to exist for a variety of deep learning architectures. Deep reinforcement learning has shown promising results on training agent policies directly on raw inputs such as image pixels. In this paper we present a novel study into adversarial attacks on deep reinforcement learning policies. We compare the effectiveness of attacks that use adversarial examples with attacks that use random noise. We present a novel method, based on the value function, for reducing the number of times adversarial examples need to be injected for a successful attack. We further explore how re-training on random noise and on Fast Gradient Sign Method (FGSM) perturbations affects resilience against adversarial examples.
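The two mechanisms named in the abstract can be illustrated with a minimal sketch. It is not the authors' implementation: the linear softmax policy, the helper names (`policy_probs`, `fgsm_perturb`, `maybe_attack`), and the rule "inject only when the value estimate exceeds a threshold" are all assumptions made for illustration, since the abstract says only that the injection schedule is based on the value function. The FGSM step itself is the standard one: move each input component by epsilon in the direction of the sign of the loss gradient.

```python
import math

def softmax(logits):
    # Numerically stable softmax over a list of floats.
    m = max(logits)
    exps = [math.exp(z - m) for z in logits]
    total = sum(exps)
    return [e / total for e in exps]

def policy_probs(weights, obs):
    # Hypothetical stand-in for a deep policy: one weight row per action.
    logits = [sum(w * x for w, x in zip(row, obs)) for row in weights]
    return softmax(logits)

def sign(v):
    return (v > 0) - (v < 0)

def fgsm_perturb(weights, obs, epsilon):
    # Loss: cross-entropy against the action the policy currently prefers.
    probs = policy_probs(weights, obs)
    chosen = max(range(len(probs)), key=probs.__getitem__)
    # d(-log p[chosen]) / d obs[i] = sum_a (p[a] - 1[a == chosen]) * W[a][i]
    grad = [
        sum((probs[a] - (1.0 if a == chosen else 0.0)) * weights[a][i]
            for a in range(len(weights)))
        for i in range(len(obs))
    ]
    # FGSM: step of size epsilon along the sign of the gradient.
    return [x + epsilon * sign(g) for x, g in zip(obs, grad)]

def maybe_attack(weights, value_fn, obs, epsilon, threshold):
    # Assumed gating rule: perturb only when the value estimate is high.
    if value_fn(obs) > threshold:
        return fgsm_perturb(weights, obs, epsilon), True
    return list(obs), False
```

In this sketch, `value_fn` is any callable that maps an observation to a scalar value estimate. Raising `threshold` makes injections rarer, which is the trade-off the abstract refers to when it speaks of reducing the number of injections.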
- Publication: arXiv e-prints
- Pub Date: May 2017
- arXiv: arXiv:1705.06452
- Bibcode: 2017arXiv170506452K
- Keywords: Statistics - Machine Learning; Computer Science - Machine Learning
- E-Print: ICLR 2017 Workshop