An unsupervised automatic classification algorithm for continuous seismic records using a nonparametric Bayesian approach
Abstract
Continuous seismic records include signals produced by various sources, such as earthquakes, human activities, and instrumental noise. A technique for automatically extracting and classifying those signals may help us understand geophysical phenomena around a seismometer. The technique would also be applicable to the automated monitoring of seismic records used in real-time systems. To those ends, we have been developing a classification method based on machine learning. Specifically, unsupervised learning is employed so that the method can be applied to various seismometers deployed in different observation environments.
Kodera and Sakai (2018, AGU) proposed an algorithm that classifies continuous records into 10 time series models using multi-step clustering in the frequency and time domains, and showed that earthquakes and various types of noise can be classified automatically without prior knowledge. That method, however, has a shortcoming: it uses spectral clustering for the time domain step, which requires the exact number of clusters as a hyper-parameter, and that number is difficult to choose without subjective decisions. To make the time domain classification more objective, we introduced the infinite relational model (IRM; Kemp et al., 2006), a nonparametric Bayesian model that assumes infinitely many clusters and determines the number of effective clusters automatically from a training dataset. We tested the new method using 72-hour waveforms recorded at station E.JDJM of MeSO-net (Kawakita and Sakai, 2009; a seismometer located near a subway) and at the temporary ocean-bottom seismometer (OBS) TN042A (Yamazaki et al., 2008; an OBS deployed off the Kii Peninsula). With the IRM, the E.JDJM and TN042A records were classified into 10 and 6 classes, respectively. The new method successfully assigned different classes to earthquakes and train noise at E.JDJM and to large aftershocks and low-frequency tremors at TN042A. However, those signals and classes generally did not have a one-to-one relationship; for instance, the train noise at E.JDJM was subdivided into 5 different classes. An additional procedure, such as further clustering of the fragmented classes, may be needed to make the classification results more intuitively understandable.
- Publication:
- AGU Fall Meeting Abstracts
- Pub Date:
- December 2019
- Bibcode:
- 2019AGUFM.S43D0684K
- Keywords:
- 0555 Neural networks, fuzzy logic, machine learning; COMPUTATIONAL GEOPHYSICS
- 1910 Data assimilation, integration and fusion; INFORMATICS
- 1914 Data mining; INFORMATICS
- 1942 Machine learning; INFORMATICS
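
The abstract contrasts spectral clustering, which needs the number of clusters fixed in advance, with a nonparametric Bayesian model in which the number of effective clusters is not set beforehand. As an illustration only, the sketch below draws cluster assignments from a Chinese restaurant process (CRP), the kind of partition prior commonly used in infinite relational models. It samples the prior alone and performs no inference on seismic data; the function name `sample_crp_assignments` and the concentration parameter `alpha` are assumptions made for this example, not part of the authors' method.

```python
import random
from collections import Counter


def sample_crp_assignments(n_items, alpha, seed=None):
    """Draw one partition of n_items from a Chinese restaurant process prior.

    Hypothetical helper for illustration: it samples the prior only and
    does not fit any model to data.
    """
    rng = random.Random(seed)
    assignments = []  # cluster label of each item, in arrival order
    sizes = []        # current number of items in each cluster
    for _ in range(n_items):
        # An existing cluster k is chosen with weight sizes[k];
        # a brand-new cluster is opened with weight alpha.
        weights = sizes + [alpha]
        k = rng.choices(range(len(weights)), weights=weights)[0]
        if k == len(sizes):
            sizes.append(1)   # open a new cluster
        else:
            sizes[k] += 1     # join an existing cluster
        assignments.append(k)
    return assignments


if __name__ == "__main__":
    # The number of occupied clusters is an outcome of the draw,
    # not an input, and tends to grow with alpha.
    for alpha in (0.5, 2.0, 8.0):
        labels = sample_crp_assignments(n_items=500, alpha=alpha, seed=0)
        print(f"alpha={alpha}: {len(Counter(labels))} clusters")
```

In a full model, such a prior would be combined with a likelihood over the observed data, so that the posterior concentrates on a data-dependent number of clusters instead of one chosen by hand.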