PatchAttack: A Black-box Texture-based Attack with Reinforcement Learning

Patch-based attacks introduce a perceptible but localized change to the input that induces misclassification. Current patch-based black-box attacks perform poorly in the targeted setting, and even in the less challenging non-targeted setting they require a large number of queries. Our proposed PatchAttack is query efficient and can break models in both targeted and non-targeted settings. PatchAttack induces misclassifications by superimposing small textured patches on the input image. We parametrize the appearance of these patches by a dictionary of class-specific textures. This texture dictionary is learned by clustering Gram matrices of feature activations from a VGG backbone. PatchAttack optimizes the position and texture parameters of each patch using reinforcement learning. Our experiments show that PatchAttack achieves a success rate above 99% on ImageNet for a wide range of architectures, while manipulating only 3% of the image for non-targeted attacks and 10% on average for targeted attacks. Furthermore, we show that PatchAttack successfully circumvents state-of-the-art adversarial defense methods.
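Two of the building blocks named in the abstract, the Gram matrix of feature activations and the superimposing of a patch on an image, can be illustrated with a minimal sketch. This is not the authors' implementation: the function names are hypothetical, plain nested lists stand in for VGG feature maps and images, and dividing the Gram matrix by the number of spatial positions is an assumed normalization. The clustering and reinforcement learning steps are not shown.

```python
def gram_matrix(features):
    """Gram matrix of a feature map.

    `features` is a list of C channels, each a flat list of N activations.
    Entry (i, j) is the inner product of channels i and j, divided by N
    (the normalization is an assumption made for this sketch).
    """
    n = len(features[0])
    return [
        [sum(a * b for a, b in zip(fi, fj)) / n for fj in features]
        for fi in features
    ]


def paste_patch(image, patch, top, left):
    """Return a copy of a 2-D image (list of rows) with `patch` placed at (top, left)."""
    height, width = len(image), len(image[0])
    p_height, p_width = len(patch), len(patch[0])
    if top < 0 or left < 0 or top + p_height > height or left + p_width > width:
        raise ValueError("patch does not fit inside the image")
    out = [row[:] for row in image]  # copy so the input image is left untouched
    for r, patch_row in enumerate(patch):
        out[top + r][left:left + p_width] = patch_row
    return out


def patch_area_fraction(image, patch):
    """Fraction of the image area covered by one patch."""
    return (len(patch) * len(patch[0])) / (len(image) * len(image[0]))
```

In this toy setting, the position arguments `top` and `left` play the role of the patch position parameters that the abstract says are optimized, and `patch_area_fraction` corresponds to the kind of quantity behind the reported 3% and 10% figures.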

Publication: arXiv e-prints

Pub Date: April 2020

DOI: 10.48550/arXiv.2004.05682

arXiv: arXiv:2004.05682

Bibcode: 2020arXiv200405682Y

Keywords: Computer Science - Computer Vision and Pattern Recognition

E-Print: To appear in ECCV 2020
