Evaluating and Understanding the Robustness of Adversarial Logit Pairing
Abstract
We evaluate the robustness of Adversarial Logit Pairing (ALP), a recently proposed defense against adversarial examples. We find that a network trained with ALP achieves 0.6% accuracy under the threat model in which the defense is considered. We give a brief overview of the defense and of the threat models and claims considered, and we discuss the methodology and results of our attack, which may offer insight into why ALP is vulnerable to adversarial attack.
- Publication: arXiv e-prints
- Pub Date: July 2018
- arXiv: arXiv:1807.10272
- Bibcode: 2018arXiv180710272E
- Keywords: Statistics - Machine Learning; Computer Science - Cryptography and Security; Computer Science - Computer Vision and Pattern Recognition; Computer Science - Machine Learning
- E-Print: NeurIPS SECML 2018. Source code at https://github.com/labsix/adversarial-logit-pairing-analysis