Automating Physical and Machine Learning Models using Scientific Workflows
Abstract
Processing and analysis of large data sets in earth science often require the complex execution of many tasks. To deal with such tasks, scientific workflow management systems (WMS) have been developed to simplify and automate complex data- and compute-intensive workflows. WMS that can handle, in a high-level way, the heterogeneous peculiarities of cutting-edge high-performance computing (HPC) architectures are rare. To evaluate the applicability of WMS to both compute- and data-intensive HPC, we investigate two scientific case studies: (i) data analysis of remotely sensed images using supervised machine learning and (ii) coupling of a continuum and a discrete element ice dynamic model. Both scenarios inherently involve many tasks but, before WMS were applied, depended on handwritten scripts and required human intervention during execution; in addition, making any change to these workflows was quite challenging. We describe the internals of both use cases in terms of challenges and opportunities, and show how to automate these workflows using the UNICORE WMS. UNICORE is a state-of-the-art WMS that provides a set of platform- and science-agnostic tools to enable, compose, orchestrate, and reuse workflows easily on distributed, heterogeneous, and multi-site HPC resource environments. In this way, the challenges posed by the above case studies could be addressed in a high-level and automated way, in particular with respect to handling the big data involved.
This work was kindly supported by NordForsk as part of the Nordic Center of Excellence (NCoE) eSTICC (eScience Tools for Investigating Climate Change at High Northern Latitudes) and the Top-level Research Initiative NCoE SVALI (Stability and Variation of Arctic Land Ice).
- Publication:
- AGU Fall Meeting Abstracts
- Pub Date:
- December 2018
- Bibcode:
- 2018AGUFMIN34B..08M
- Keywords:
- 1912 Data management, preservation, rescue, INFORMATICS;
- 1916 Data and information discovery, INFORMATICS;
- 1930 Data and information governance, INFORMATICS;
- 1942 Machine learning, INFORMATICS