Obfuscated Gradients Give a False Sense of Security: Circumventing Defenses to Adversarial Examples

doi:10.48550/arXiv.1802.00420

We identify obfuscated gradients, a kind of gradient masking, as a phenomenon that leads to a false sense of security in defenses against adversarial examples. While defenses that cause obfuscated gradients appear to defeat iterative optimization-based attacks, we find defenses relying on this effect can be circumvented. We describe characteristic behaviors of defenses exhibiting the effect, and for each of the three types of obfuscated gradients we discover, we develop attack techniques to overcome it. In a case study, examining non-certified white-box-secure defenses at ICLR 2018, we find obfuscated gradients are a common occurrence, with 7 of 9 defenses relying on obfuscated gradients. Our new attacks successfully circumvent 6 completely, and 1 partially, in the original threat model each paper considers.
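As a rough illustration of the effect the abstract describes, the sketch below shows one way gradients can become uninformative: a non-differentiable preprocessing step placed in front of a model. It is not taken from the paper's source code. The toy linear model (`W`, `B`), the `quantize` defense, the input `X0`, and all function names are hypothetical, chosen only to keep the example small. The finite-difference gradient stands in for what automatic differentiation would report through the rounding step, and `approx_grad` treats that step as the identity on the backward pass, in the spirit of the approximation-based attacks the paper develops.

```python
def quantize(x, levels=8):
    # Hypothetical defense: snap each input to a coarse grid.
    # The output is piecewise constant, so its gradient is zero almost everywhere.
    return [round(v * levels) / levels for v in x]

W = [1.0, -2.0, 0.5]  # toy linear model weights (assumption)
B = 0.1               # toy bias (assumption)
X0 = [0.55, 0.2, 0.3] # example input, classified as class 1 (score > 0)

def score(x):
    # Defended model: quantize first, then apply the linear model.
    return sum(w * q for w, q in zip(W, quantize(x))) + B

def numeric_grad(x, h=1e-4):
    # Central finite differences through the defended model.
    grads = []
    for i in range(len(x)):
        up = x[:i] + [x[i] + h] + x[i + 1:]
        down = x[:i] + [x[i] - h] + x[i + 1:]
        grads.append((score(up) - score(down)) / (2 * h))
    return grads

def approx_grad(x):
    # Treat quantize() as the identity on the backward pass,
    # so the gradient of the score with respect to x is approximately W.
    return list(W)

def sign(v):
    return (v > 0) - (v < 0)

def iterative_attack(x0, grad_fn, step=0.02, eps=0.3, iters=50):
    # Iterative signed-gradient descent on the score, kept inside an
    # L-infinity ball of radius eps around the original input.
    x = list(x0)
    for _ in range(iters):
        if score(x) <= 0:  # prediction has flipped
            break
        g = grad_fn(x)
        x = [xi - step * sign(gi) for xi, gi in zip(x, g)]
        x = [min(max(xi, oi - eps), oi + eps) for xi, oi in zip(x, x0)]
    return x

stuck = iterative_attack(X0, numeric_grad)  # zero gradient: input never moves
adv = iterative_attack(X0, approx_grad)     # approximate gradient: prediction flips
```

In this toy setting the attack driven by the true (zero) gradient leaves the input unchanged, which could be mistaken for robustness, while the attack driven by the approximate gradient crosses the decision boundary within the same perturbation budget.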

Publication: arXiv e-prints
Pub Date: February 2018
DOI: 10.48550/arXiv.1802.00420
arXiv: arXiv:1802.00420
Bibcode: 2018arXiv180200420A
Keywords: Computer Science - Machine Learning; Computer Science - Artificial Intelligence; Computer Science - Cryptography and Security
E-Print: ICML 2018. Source code at https://github.com/anishathalye/obfuscated-gradients
