Interpreting convolutional neural network decision for earthquake detection with backward optimisation and layer-wise relevance propagation methods
Abstract
Deep learning (DL) models have already been applied to many seismological problems, such as the identification of earthquake waveforms in noisy continuous data, the classification of earthquake waveforms by common earthquake characteristics (e.g. magnitude, epicentral distance, hypocenter depth), the identification of P- and S-wave arrivals, and the implementation of early warning systems. Numerous studies have reported that DL models perform well compared with classical state-of-the-art seismological methods; however, the understanding and interpretability of their decisions are being questioned. So far, these highly parametrised models have mostly been ranked by their accuracy. In this study, we propose using backward optimisation and layer-wise relevance propagation to scrutinise the decisions of a convolutional neural network (CNN) for earthquake detection. These methods can help us answer questions such as: What is the optimal earthquake signal according to the CNN? Which parts of the earthquake signal are more or less relevant for the model to recognise it as an earthquake? Eventually, these findings might help us understand how to build a better CNN model and whether there is any physical meaning embedded in the model. The CNN model used in this study was trained for single-station detection; thus the input is a 25-second-long three-component waveform and the model output is a binary class, signal or noise. The normalised, unfiltered training dataset consists of positive (earthquake signal) and negative (noise) samples of 25 seconds duration. The positive samples span a wide range of earthquakes, from local to teleseismic, with a focus on local and regional ones. With these two methods we were able to visualise that the part of the signal most relevant to the CNN earthquake detector is the part with the most energy, namely the part related to the S wave, followed by the P wave. The model does not focus on the time interval before the P-wave arrival, and it also ignores the coda of the signal.
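As an illustration only, the two interpretation methods named above can be sketched on a toy model. The network below is a small dense ReLU stand-in with random weights, not the CNN from this study; the window length, layer sizes, step size and the choice of the epsilon rule for layer-wise relevance propagation are all assumptions made for the example. Backward optimisation is shown as gradient ascent on the input to raise the "earthquake" score, and relevance propagation redistributes that score back onto the input samples.

```python
import numpy as np

rng = np.random.default_rng(0)

# Hypothetical toy detector (NOT the study's CNN): a flattened 3-component
# window of 50 samples -> ReLU hidden layer -> one "earthquake" logit.
n_comp, n_samp, n_hidden = 3, 50, 16
W1 = rng.normal(scale=0.1, size=(n_hidden, n_comp * n_samp))
b1 = np.zeros(n_hidden)
w2 = rng.normal(scale=0.1, size=n_hidden)
b2 = 0.0

def forward(x):
    """Return hidden activations and the scalar logit for a flat input x."""
    a1 = np.maximum(W1 @ x + b1, 0.0)
    return a1, float(w2 @ a1 + b2)

def input_gradient(x):
    """Gradient of the logit with respect to the input (manual backprop)."""
    a1, _ = forward(x)
    return W1.T @ (w2 * (a1 > 0))

def backward_optimise(x_start, steps=200, lr=0.1):
    """Gradient ascent on the input, re-normalised to unit norm each step."""
    x = x_start.copy()
    for _ in range(steps):
        x = x + lr * input_gradient(x)
        x = x / max(np.linalg.norm(x), 1e-12)
    return x

def lrp_epsilon(x, eps=1e-9):
    """Epsilon-rule relevance of each input sample for the logit."""
    a1, logit = forward(x)
    # Output -> hidden: share the logit in proportion to each contribution.
    z2 = w2 * a1
    s2 = z2.sum() + b2
    r_hidden = z2 / (s2 + eps * (1.0 if s2 >= 0 else -1.0)) * logit
    # Hidden -> input: share each hidden relevance over its input contributions.
    z1 = W1 * x                          # z1[j, i] = W1[j, i] * x[i]
    s1 = z1.sum(axis=1) + b1
    denom = s1 + eps * np.where(s1 >= 0, 1.0, -1.0)
    r_input = (z1 / denom[:, None] * r_hidden[:, None]).sum(axis=0)
    return r_input.reshape(n_comp, n_samp)

# Random unit-norm window standing in for a normalised waveform.
x0 = rng.normal(size=n_comp * n_samp)
x0 /= np.linalg.norm(x0)

x_opt = backward_optimise(x0)            # the toy model's "optimal" input
relevance = lrp_epsilon(x0)              # per-component, per-sample relevance

print("logit before / after optimisation:", forward(x0)[1], forward(x_opt)[1])
print("sum of relevance vs logit:", relevance.sum(), forward(x0)[1])
```

With zero biases, as assumed here, the relevance values sum approximately to the logit, which is the conservation property that makes the relevance map readable as a decomposition of the model's decision over the waveform.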
- Publication: AGU Fall Meeting Abstracts
- Pub Date: December 2021
- Bibcode: 2021AGUFM.S34A..04M