Review of white box methods for explanations of convolutional neural networks in image classification tasks
Abstract
In recent years, deep learning has become prevalent for solving problems across multiple domains. Convolutional neural networks (CNNs) in particular have demonstrated state-of-the-art performance in image classification. However, the decisions made by these networks are not transparent and cannot be directly interpreted by a human. Several approaches have been proposed to explain the reasoning behind a network's prediction. We propose a taxonomy that groups these methods by their assumptions and implementations. We focus primarily on white box methods, which leverage information about the internal architecture of a network to explain its decision. Given an image classification task and a trained CNN, our work aims to provide a comprehensive and detailed overview of methods for creating explanation maps for a particular image. Such a map assigns an importance score to each pixel of the image based on its contribution to the decision of the network. We also propose a further classification of the white box methods based on their implementations, to enable better comparisons and to help researchers find the methods best suited to different scenarios.
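To make the notion of an explanation map concrete, the sketch below computes a plain gradient map, one common white box approach, for a toy model. Everything in it is an illustrative assumption rather than a method taken from the paper: the "network" is a single hypothetical 3x3 convolution followed by ReLU and sum pooling, and the weights and the 5x5 image are made up. The gradient is derived by hand from the model's internals, which is what makes the approach white box.

```python
# Toy white-box saliency sketch: gradient of a class score w.r.t. each pixel.
# The "network" (one 3x3 convolution, ReLU, sum pooling) is a hypothetical
# stand-in for a trained CNN; its weights are made up for illustration.

def conv_valid(image, kernel):
    """Valid 2D cross-correlation of a 2D list with a square kernel."""
    k = len(kernel)
    h, w = len(image), len(image[0])
    return [[sum(kernel[a][b] * image[p + a][q + b]
                 for a in range(k) for b in range(k))
             for q in range(w - k + 1)]
            for p in range(h - k + 1)]

def class_score(image, kernel):
    """Score = sum over all positions of ReLU(conv(image, kernel))."""
    return sum(max(v, 0.0) for row in conv_valid(image, kernel) for v in row)

def gradient_saliency(image, kernel):
    """Absolute gradient of class_score with respect to every pixel."""
    k = len(kernel)
    h, w = len(image), len(image[0])
    pre = conv_valid(image, kernel)  # pre-activations (internal state)
    grad = [[0.0] * w for _ in range(h)]
    for p, row in enumerate(pre):
        for q, v in enumerate(row):
            if v > 0:  # ReLU passes gradient only where it is active
                for a in range(k):
                    for b in range(k):
                        grad[p + a][q + b] += kernel[a][b]
    return [[abs(g) for g in row] for row in grad]

# Assumed example: a vertical-edge kernel and an image with a bright right side.
KERNEL = [[-1.0, 0.0, 1.0]] * 3
IMAGE = [[0.0, 0.0, 0.0, 1.0, 1.0] for _ in range(5)]
SALIENCY = gradient_saliency(IMAGE, KERNEL)
```

In this toy case the leftmost column receives a score of zero because no active unit depends on it with a nonzero weight, while the pixels around the edge receive the highest scores. With a real CNN, the same gradient would normally be obtained through a framework's automatic differentiation rather than derived by hand.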
- Publication: Journal of Electronic Imaging
- Pub Date: September 2021
- DOI: 10.1117/1.JEI.30.5.050901
- arXiv: arXiv:2104.02548
- Bibcode: 2021JEI....30e0901A
- Keywords: explainable artificial intelligence; deep learning; convolutional neural networks; object classification; interpretability; Computer Science - Computer Vision and Pattern Recognition; Computer Science - Artificial Intelligence; Computer Science - Machine Learning
- E-Print: Submitted to Journal of Electronic Imaging (JEI)