Topology-preserving Adversarial Training for Alleviating Natural Accuracy Degradation
Abstract
Despite the effectiveness in improving the robustness of neural networks, adversarial training has suffered from the natural accuracy degradation problem, i.e., accuracy on natural samples has reduced significantly. In this study, we reveal that natural accuracy degradation is highly related to the disruption of the natural sample topology in the representation space by quantitative and qualitative experiments. Based on this observation, we propose Topology-pReserving Adversarial traINing (TRAIN) to alleviate the problem by preserving the topology structure of natural samples from a standard model trained only on natural samples during adversarial training. As an additional regularization, our method can be combined with various popular adversarial training algorithms, taking advantage of both sides. Extensive experiments on CIFAR-10, CIFAR-100, and Tiny ImageNet show that our proposed method achieves consistent and significant improvements over various strong baselines in most cases. Specifically, without additional data, TRAIN achieves up to 8.86% improvement in natural accuracy and 6.33% improvement in robust accuracy.
- Publication:
-
arXiv e-prints
- Pub Date:
- November 2023
- DOI:
- 10.48550/arXiv.2311.17607
- arXiv:
- arXiv:2311.17607
- Bibcode:
- 2023arXiv231117607M
- Keywords:
-
- Computer Science - Computer Vision and Pattern Recognition;
- Computer Science - Machine Learning
- E-Print:
- BMVC 2024