Certified Robustness for Top-k Predictions against Adversarial Perturbations via Randomized Smoothing
Abstract
It is well-known that classifiers are vulnerable to adversarial perturbations. To defend against adversarial perturbations, various certified robustness results have been derived. However, existing certified robustnesses are limited to top-1 predictions. In many real-world applications, top-$k$ predictions are more relevant. In this work, we aim to derive certified robustness for top-$k$ predictions. In particular, our certified robustness is based on randomized smoothing, which turns any classifier to a new classifier via adding noise to an input example. We adopt randomized smoothing because it is scalable to large-scale neural networks and applicable to any classifier. We derive a tight robustness in $\ell_2$ norm for top-$k$ predictions when using randomized smoothing with Gaussian noise. We find that generalizing the certified robustness from top-1 to top-$k$ predictions faces significant technical challenges. We also empirically evaluate our method on CIFAR10 and ImageNet. For example, our method can obtain an ImageNet classifier with a certified top-5 accuracy of 62.8\% when the $\ell_2$-norms of the adversarial perturbations are less than 0.5 (=127/255). Our code is publicly available at: \url{https://github.com/jjy1994/Certify_Topk}.
- Publication:
-
arXiv e-prints
- Pub Date:
- December 2019
- DOI:
- 10.48550/arXiv.1912.09899
- arXiv:
- arXiv:1912.09899
- Bibcode:
- 2019arXiv191209899J
- Keywords:
-
- Computer Science - Machine Learning;
- Computer Science - Cryptography and Security;
- Statistics - Machine Learning
- E-Print:
- ICLR 2020, code is available at this: https://github.com/jjy1994/Certify_Topk