Topology-preserving Adversarial Training for Alleviating Natural Accuracy Degradation

doi:10.48550/arXiv.2311.17607

Despite its effectiveness in improving the robustness of neural networks, adversarial training suffers from natural accuracy degradation, i.e., a significant drop in accuracy on natural samples. In this study, we show through quantitative and qualitative experiments that natural accuracy degradation is closely related to the disruption of the natural sample topology in the representation space. Based on this observation, we propose Topology-pReserving Adversarial traINing (TRAIN), which alleviates the problem by preserving, during adversarial training, the topology of natural samples as represented by a standard model trained only on natural samples. Because it acts as an additional regularization, our method can be combined with various popular adversarial training algorithms, retaining the benefits of both. Extensive experiments on CIFAR-10, CIFAR-100, and Tiny ImageNet show that the proposed method achieves consistent and significant improvements over various strong baselines in most cases. Specifically, without additional data, TRAIN achieves up to 8.86% improvement in natural accuracy and 6.33% improvement in robust accuracy.
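The abstract does not give the exact loss, so the following is only an illustrative sketch of what a topology-preserving regularizer could look like, not the authors' formulation. It assumes that "topology" is summarized by each sample's softmax neighbor distribution over pairwise distances in the representation space, and that the mismatch between the standard model and the adversarially trained model is measured with a KL divergence. The names `neighbor_distribution`, `topology_penalty`, `total_loss`, `temperature`, and `weight` are hypothetical.

```python
import math


def neighbor_distribution(features, temperature=1.0):
    """For each sample, softmax over negative squared distances to all others.

    Assumption: this is one possible way to encode neighborhood structure;
    the abstract does not specify how topology is represented.
    """
    rows = []
    for i, f_i in enumerate(features):
        logits = []
        for j, f_j in enumerate(features):
            if i == j:
                continue  # a sample is not its own neighbor
            dist_sq = sum((a - b) ** 2 for a, b in zip(f_i, f_j))
            logits.append(-dist_sq / temperature)
        peak = max(logits)  # subtract the max for numerical stability
        exps = [math.exp(v - peak) for v in logits]
        norm = sum(exps)
        rows.append([e / norm for e in exps])
    return rows


def topology_penalty(standard_feats, adversarial_feats, temperature=1.0, eps=1e-12):
    """Mean KL(p || q) between neighbor distributions of the two models.

    p comes from the standard (natural-only) model and is treated as the target;
    q comes from the model being adversarially trained.
    """
    p = neighbor_distribution(standard_feats, temperature)
    q = neighbor_distribution(adversarial_feats, temperature)
    total = 0.0
    for p_row, q_row in zip(p, q):
        total += sum(
            pi * math.log((pi + eps) / (qi + eps)) for pi, qi in zip(p_row, q_row)
        )
    return total / len(p)


def total_loss(adversarial_loss, standard_feats, adversarial_feats, weight=1.0):
    """Adds the penalty to any base adversarial training loss (hypothetical weighting)."""
    return adversarial_loss + weight * topology_penalty(standard_feats, adversarial_feats)
```

Under these assumptions, the penalty is zero when both models place the natural samples in the same relative arrangement and grows as neighbors are reshuffled. Because it enters only as an added term, the base adversarial loss can come from any existing algorithm.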

Publication:

arXiv e-prints

Pub Date:

November 2023

DOI:

10.48550/arXiv.2311.17607

arXiv:

arXiv:2311.17607

Bibcode:

2023arXiv231117607M

Keywords:

Computer Science - Computer Vision and Pattern Recognition;
Computer Science - Machine Learning

E-Print:

BMVC 2024
