Bayesian Adversarial Spheres: Bayesian Inference and Adversarial Examples in a Noiseless Setting
Abstract
Modern deep neural network models suffer from adversarial examples, i.e. confidently misclassified points in the input space. It has been shown that Bayesian neural networks are a promising approach for detecting adversarial points, but careful analysis is difficult because of the complexity of these models. Recently, Gilmer et al. (2018) introduced adversarial spheres, a toy set-up that simplifies both practical and theoretical analysis of the problem. In this work, we use the adversarial sphere set-up to understand the properties of approximate Bayesian inference methods for a linear model in a noiseless setting. We compare the predictions of Bayesian and non-Bayesian methods, showcasing the advantages of the former while also revealing open challenges for deep learning applications.
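To make the set-up above concrete, the following is a minimal sketch of an adversarial-spheres style data set: points are drawn uniformly from two concentric spheres and labelled by which sphere they lie on, so the labels carry no noise. The function name `sample_spheres`, the dimension, and the radii (1.0 and 1.3) are assumptions chosen for illustration and are not stated in this abstract. The squared-feature decision rule at the end is likewise only an illustration of why a linear model can fit such data exactly; it should not be read as the model used in the paper.

```python
import numpy as np


def sample_spheres(n, dim, r_inner=1.0, r_outer=1.3, seed=0):
    """Draw n points uniformly from two concentric spheres in `dim` dimensions.

    Hypothetical helper: the radii and the labelling convention
    (0 = inner sphere, 1 = outer sphere) are illustrative assumptions.
    """
    rng = np.random.default_rng(seed)
    # Normalising Gaussian vectors gives points uniform on the unit sphere.
    z = rng.standard_normal((n, dim))
    unit = z / np.linalg.norm(z, axis=1, keepdims=True)
    # Labels are a deterministic function of the radius: a noiseless setting.
    y = rng.integers(0, 2, size=n)
    radii = np.where(y == 1, r_outer, r_inner)
    return unit * radii[:, None], y


X, y = sample_spheres(n=1000, dim=50)
norms = np.linalg.norm(X, axis=1)

# Illustrative assumption: on squared features x_i**2, a linear rule with
# unit weights recovers the squared radius, so a single threshold placed
# between r_inner**2 and r_outer**2 separates the classes exactly.
squared_features = X ** 2
score = squared_features @ np.ones(X.shape[1])
threshold = (1.0 ** 2 + 1.3 ** 2) / 2
pred = (score > threshold).astype(int)

print("training accuracy of the illustrative rule:", (pred == y).mean())
```

Because the classes are perfectly separable, many different weight vectors fit the training data exactly. That is the kind of situation in which point estimates and Bayesian posteriors over a linear model can be compared.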
- Publication: arXiv e-prints
- Pub Date: November 2018
- DOI: 10.48550/arXiv.1811.12335
- arXiv: arXiv:1811.12335
- Bibcode: 2018arXiv181112335B
- Keywords: Statistics - Machine Learning; Computer Science - Machine Learning
- E-Print: To appear in the third workshop on Bayesian Deep Learning (NeurIPS 2018), Montreal, Canada