Stability and Generalization in Free Adversarial Training

doi:10.48550/arXiv.2404.08980

Adversarial training methods have significantly improved the robustness of deep neural networks against norm-bounded adversarial perturbations, but their generalization from training samples to test data has been shown to be considerably worse than that of standard empirical risk minimization. Several recent studies seek to connect the generalization behavior of adversarially trained classifiers to the gradient-based min-max optimization algorithms used to train them. In this work, we study the generalization performance of adversarial training methods using the algorithmic stability framework. Specifically, we compare the vanilla adversarial training scheme, which fully optimizes the perturbations at every iteration, with free adversarial training, which simultaneously optimizes the norm-bounded perturbations and the classifier parameters. Our proven generalization bounds indicate that free adversarial training could enjoy a lower generalization gap between training and test samples because of the simultaneous nature of its min-max optimization algorithm. We perform several numerical experiments to evaluate the generalization performance of vanilla, fast, and free adversarial training methods. Our empirical findings also show improved generalization for free adversarial training and further demonstrate that better generalization could translate into greater robustness against black-box attack schemes. The code is available at https://github.com/Xiwei-Cheng/Stability_FreeAT.
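To make the contrast between the two training schemes concrete, the following is a minimal sketch and not the authors' implementation (that is in the repository linked above). It assumes a linear logistic-regression classifier, full-batch updates, and an L-infinity perturbation ball; the function names, step sizes, and synthetic data are all hypothetical choices made for illustration. Vanilla adversarial training runs an inner projected-gradient loop on the perturbations before each parameter step, while free adversarial training reuses one gradient computation to update the parameters and the perturbations together, and carries the perturbations over between iterations.

```python
import numpy as np

def loss_grads(w, X, y, delta):
    """Gradients of the logistic loss w.r.t. weights and per-sample perturbations.

    Labels y are in {-1, +1}. The perturbation gradient is left unscaled by 1/n,
    since only its sign is used below.
    """
    X_adv = X + delta
    margins = np.clip(y * (X_adv @ w), -50.0, 50.0)  # clip to avoid overflow in exp
    coeff = -y / (1.0 + np.exp(margins))             # d loss_i / d (x_i . w)
    grad_w = (X_adv * coeff[:, None]).mean(axis=0)
    grad_delta = coeff[:, None] * w[None, :]
    return grad_w, grad_delta

def vanilla_at(X, y, eps, lr=0.5, steps=300, pgd_steps=5, pgd_lr=0.05):
    """Vanilla scheme: re-optimize the perturbations with PGD before every weight step."""
    w = np.zeros(X.shape[1])
    for _ in range(steps):
        delta = np.zeros_like(X)
        for _ in range(pgd_steps):                   # inner maximization
            _, g_d = loss_grads(w, X, y, delta)
            delta = np.clip(delta + pgd_lr * np.sign(g_d), -eps, eps)
        g_w, _ = loss_grads(w, X, y, delta)          # outer minimization
        w = w - lr * g_w
    return w

def free_at(X, y, eps, lr=0.5, steps=300, delta_lr=0.05):
    """Free scheme: one gradient pass updates weights and perturbations simultaneously."""
    w = np.zeros(X.shape[1])
    delta = np.zeros_like(X)                         # persists across iterations
    for _ in range(steps):
        g_w, g_d = loss_grads(w, X, y, delta)        # single pass, both gradients
        w = w - lr * g_w                             # descent on the weights
        delta = np.clip(delta + delta_lr * np.sign(g_d), -eps, eps)  # ascent, projected
    return w, delta

# Hypothetical synthetic data: linearly separable labels from a fixed direction.
rng = np.random.default_rng(0)
X = rng.normal(size=(200, 5))
y = np.where(X @ np.ones(5) >= 0, 1.0, -1.0)

eps = 0.1
w_vanilla = vanilla_at(X, y, eps)
w_free, delta_free = free_at(X, y, eps)
acc_vanilla = np.mean(np.sign(X @ w_vanilla) == y)
acc_free = np.mean(np.sign(X @ w_free) == y)
```

In this sketch the vanilla loop needs `pgd_steps + 1` gradient evaluations per weight update, whereas the free loop needs one, which is the cost difference the simultaneous scheme is built around. The sketch says nothing about the generalization gap itself; measuring that would require comparing adversarial loss on held-out data.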

Publication: arXiv e-prints

Pub Date: April 2024

DOI: 10.48550/arXiv.2404.08980

arXiv: arXiv:2404.08980

Bibcode: 2024arXiv240408980C

Keywords: Computer Science - Machine Learning; Statistics - Machine Learning

NASA/ADS
