The MineTool Software Suite: A Novel Data Mining Palette of Tools for Automated Modeling of Space Physics Data
Abstract
We present MineTool, a new data mining software tool for the analysis and modeling of space physics data. MineTool is a graphical user interface implementation that merges two data mining algorithms into one easy-to-use tool: an algorithm for analysis and modeling of static data [Karimabadi et al., 2007] and MineTool-TS, an algorithm for data mining of time series data [Karimabadi et al., 2009]. By automating the modeling process and the model evaluations, MineTool makes data mining and predictive modeling more accessible to non-experts. The software is written entirely in Java and is freeware. Because it ranks all inputs as predictors of the outcome before constructing a model, MineTool also allows only the relevant variables to be included. The technique aggregates the various stages of model building into a four-step process consisting of (i) data segmentation and sampling, (ii) variable pre-selection and transform generation, (iii) predictive model estimation and validation, and (iv) final model selection. Optimal strategies are chosen for each modeling step. A notable feature of the technique is that the final model is always in closed analytical form rather than the “black box” form characteristic of some other techniques. Having the analytical model makes it possible to determine how important each variable is in affecting the outcome. The MineTool suite also provides capabilities for preparing data for data mining and for visualizing the datasets. MineTool has been used successfully to develop models for automated detection of flux transfer events (FTEs) at Earth’s magnetopause in Cluster spacecraft time series data and for 3D magnetopause modeling. In this presentation, we demonstrate the ease of use of the software through examples, including how it was applied to the FTE problem.
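To illustrate the idea of ranking inputs as predictors before a model is built (step ii above), the following is a minimal Java sketch. It is not MineTool code: the abstract does not say which ranking criterion MineTool uses, so absolute Pearson correlation with the outcome is assumed here purely for illustration, and the class and method names (`VariableRanking`, `pearson`, `rank`, `preselect`) are hypothetical.

```java
import java.util.Arrays;
import java.util.Comparator;

/** Hypothetical sketch: rank candidate inputs by |Pearson correlation| with the outcome. */
final class VariableRanking {

    /** Pearson correlation of two equal-length series; returns 0 if either is constant. */
    static double pearson(double[] x, double[] y) {
        double meanX = Arrays.stream(x).average().orElse(0.0);
        double meanY = Arrays.stream(y).average().orElse(0.0);
        double sxy = 0.0, sxx = 0.0, syy = 0.0;
        for (int i = 0; i < x.length; i++) {
            double dx = x[i] - meanX;
            double dy = y[i] - meanY;
            sxy += dx * dy;
            sxx += dx * dx;
            syy += dy * dy;
        }
        if (sxx == 0.0 || syy == 0.0) {
            return 0.0; // a constant series carries no linear information
        }
        return sxy / Math.sqrt(sxx * syy);
    }

    /** Returns input indices sorted by descending |correlation| with the outcome. */
    static int[] rank(double[][] inputs, double[] outcome) {
        Integer[] order = new Integer[inputs.length];
        double[] score = new double[inputs.length];
        for (int j = 0; j < inputs.length; j++) {
            order[j] = j;
            score[j] = Math.abs(pearson(inputs[j], outcome));
        }
        Arrays.sort(order, Comparator.comparingDouble((Integer j) -> -score[j]));
        return Arrays.stream(order).mapToInt(Integer::intValue).toArray();
    }

    /** Keeps only the k best-ranked inputs (assumed pre-selection rule). */
    static int[] preselect(double[][] inputs, double[] outcome, int k) {
        return Arrays.copyOf(rank(inputs, outcome), Math.min(k, inputs.length));
    }

    public static void main(String[] args) {
        double[] outcome = {1, 2, 3, 4, 5};
        double[][] inputs = {
            {2, 1, 2, 1, 2},   // essentially uncorrelated with the outcome
            {2, 4, 6, 8, 10},  // perfectly correlated
            {5, 4, 3, 2, 2}    // strongly anti-correlated
        };
        System.out.println(Arrays.toString(preselect(inputs, outcome, 2))); // prints [1, 2]
    }
}
```

In this toy example the second and third inputs are kept and the first is dropped. A real pre-selection step would likely also need to handle nonlinear relationships and redundant inputs, which a plain correlation score does not capture.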
- Publication: AGU Fall Meeting Abstracts
- Pub Date: December 2009
- Bibcode: 2009AGUFMSM11B1588S
- Keywords: 1914 INFORMATICS / Data mining; 1976 INFORMATICS / Software tools and services; 2794 MAGNETOSPHERIC PHYSICS / Instruments and techniques; 2799 MAGNETOSPHERIC PHYSICS / General or miscellaneous