Using Pre-Training Can Improve Model Robustness and Uncertainty
Abstract
He et al. (2018) have called into question the utility of pre-training by showing that training from scratch can often yield similar performance to pre-training. We show that although pre-training may not improve performance on traditional classification metrics, it improves model robustness and uncertainty estimates. Through extensive experiments on adversarial examples, label corruption, class imbalance, out-of-distribution detection, and confidence calibration, we demonstrate large gains from pre-training and complementary effects with task-specific methods. We introduce adversarial pre-training and show approximately a 10% absolute improvement over the previous state-of-the-art in adversarial robustness. In some cases, using pre-training without task-specific methods also surpasses the state-of-the-art, highlighting the need for pre-training when evaluating future methods on robustness and uncertainty tasks.
- Publication: arXiv e-prints
- Pub Date: January 2019
- DOI:
- arXiv: arXiv:1901.09960
- Bibcode: 2019arXiv190109960H
- Keywords: Computer Science - Machine Learning; Computer Science - Computer Vision and Pattern Recognition; Statistics - Machine Learning
- E-Print: ICML 2019. PyTorch code here: https://github.com/hendrycks/pre-training (Figure 3 updated)
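For a rough picture of the adversarial pre-training idea mentioned in the abstract, the sketch below shows a pre-trained network being fine-tuned on PGD adversarial examples for a downstream task. This is only a minimal illustration, not the authors' released implementation: the ResNet-50 backbone, the 10-class downstream head, and the PGD hyperparameters (eps = 8/255, alpha = 2/255, 7 steps) are assumptions chosen for the example.

```python
# Illustrative sketch only: the backbone, head size, and PGD hyperparameters
# below are assumptions, not the authors' released configuration.
import torch
import torch.nn.functional as F
import torchvision

# Start from an ImageNet-pre-trained network and swap the classifier head
# for a hypothetical 10-class downstream task.
model = torchvision.models.resnet50(weights="IMAGENET1K_V1")
model.fc = torch.nn.Linear(model.fc.in_features, 10)
model.train()

optimizer = torch.optim.SGD(model.parameters(), lr=0.01, momentum=0.9)

def pgd_attack(model, x, y, eps=8 / 255, alpha=2 / 255, steps=7):
    """Craft L-infinity PGD adversarial examples around a clean batch x."""
    x_adv = (x + torch.empty_like(x).uniform_(-eps, eps)).clamp(0, 1)
    for _ in range(steps):
        x_adv = x_adv.clone().detach().requires_grad_(True)
        loss = F.cross_entropy(model(x_adv), y)
        grad = torch.autograd.grad(loss, x_adv)[0]
        x_adv = x_adv.detach() + alpha * grad.sign()   # step up the loss surface
        x_adv = x + (x_adv - x).clamp(-eps, eps)       # project back into the eps-ball
        x_adv = x_adv.clamp(0, 1)                      # keep pixels in a valid range
    return x_adv.detach()

def adversarial_finetune_step(x, y):
    """One fine-tuning step that trains the pre-trained model on adversarial examples."""
    x_adv = pgd_attack(model, x, y)
    optimizer.zero_grad()
    loss = F.cross_entropy(model(x_adv), y)
    loss.backward()
    optimizer.step()
    return loss.item()
```

The point of the sketch is the ordering: the robust objective is applied during fine-tuning of an already pre-trained initialization rather than when training from scratch, which is the setting the abstract credits for the gain in adversarial robustness.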