Putting Predictive Models to Use: Scoring of Unseen Streaming Data using a Multivariate Time Series Classification Tool
Abstract
Advances in data collection and data storage technologies have made the assembly of multivariate time series data more common. Analyzing and extracting knowledge from the massive, complex datasets now encountered in space physics remains a major obstacle to fully utilizing our vast data repositories and to scientific progress. In previous years we introduced a time series classification tool, MineTool-TS [Karimabadi et al., 2009], and its extension to simulation and streaming data [Sipes & Karimabadi, 2012, 2013]. In this work we demonstrate the applicability and real-world utility of the predictive models created with the tool for scoring and labeling a large dataset of unseen, streaming data. Predictive models rest on the assumption that the training data used to create them are truly representative of the population. Multivariate time series datasets, however, are characterized by large amounts of variability and potential background noise, and the streaming nature of the data raises further issues. We illustrate how we dealt with these challenges and demonstrate the results in a study of flux ropes in the plasma sheet. We used an iterative process: we built a predictive model from the original labeled training set, tested it on a week's worth of streaming data, had the results checked by a scientific expert in the domain, and fed the results and the labels back into the training set, creating a larger training set that was used to produce the final model. This final model was then used to score a very large, unseen, six-month period of streaming data. We present the results of this machine learning approach to automatically detecting flux ropes in spacecraft data.
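The iterative process described above can be sketched roughly as follows. This is a minimal illustration only, not the MineTool-TS implementation: the nearest-centroid classifier is a simple stand-in for the actual model, and the names `window_features`, `NearestCentroid`, `refine_with_expert`, and the `expert_label` callback are hypothetical and do not come from the tool.

```python
from statistics import mean


def window_features(window):
    """Summarize one multivariate window (a list of per-channel sample lists)
    by the mean and the range of each channel. Illustrative features only."""
    feats = []
    for channel in window:
        feats.append(mean(channel))
        feats.append(max(channel) - min(channel))
    return feats


class NearestCentroid:
    """Stand-in classifier (assumption): assigns the label of the closest class centroid."""

    def fit(self, X, y):
        self.centroids = {}
        for label in set(y):
            rows = [x for x, lab in zip(X, y) if lab == label]
            # Column-wise mean of all feature vectors carrying this label.
            self.centroids[label] = [mean(col) for col in zip(*rows)]
        return self

    def predict(self, X):
        def sq_dist(a, b):
            return sum((p - q) ** 2 for p, q in zip(a, b))

        return [
            min(self.centroids, key=lambda lab: sq_dist(x, self.centroids[lab]))
            for x in X
        ]


def refine_with_expert(train_X, train_y, stream_X, expert_label):
    """One round of the loop: fit, score a short stretch of streaming data,
    let an expert confirm or correct each predicted label, then refit on the
    enlarged training set and return the final model."""
    initial = NearestCentroid().fit(train_X, train_y)
    predicted = initial.predict(stream_X)
    # expert_label(features, predicted_label) returns the verified label.
    verified = [expert_label(x, p) for x, p in zip(stream_X, predicted)]
    return NearestCentroid().fit(train_X + stream_X, train_y + verified)
```

In this sketch the model returned by `refine_with_expert` would then be applied with `predict` to the long unseen stretch of streaming data; how windows are cut from the stream and how the expert review is carried out are left open here.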
- Publication: AGU Fall Meeting Abstracts
- Pub Date: December 2013
- Bibcode: 2013AGUFMSM53D2257S
- Keywords: 2744 MAGNETOSPHERIC PHYSICS Magnetotail; 1942 INFORMATICS Machine learning