Accelerating the Operation of Complex Workflows through Standard Data Interfaces
Abstract
In this position paper, we argue for standardizing how data is shared and processed in scientific workflows at the network level, to maximize step reuse and workflow portability across platforms and networks in pursuit of a foundational workflow stack. We look to evolve workflows from steps connected point-to-point in a directed acyclic graph (DAG) to steps connected via shared channels in a message system implemented as a network service. To start this evolution, we contribute a preliminary reference model, an architecture, and open tools to implement the architecture today. Our goal is to improve the deployment and operation of complex workflows by decoupling data sharing from data processing in workflow steps. We seek the workflow community's input on the merit of this approach, related research to explore, and initial requirements to inform future research.
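The contrast between point-to-point DAG edges and shared channels can be illustrated with a minimal sketch. Everything here is an assumption made for illustration and is not the paper's reference model or tooling: the `ChannelBroker` class, the channel names `raw` and `clean`, and the step functions are hypothetical, and an in-process dictionary stands in for the message system that the abstract envisions as a network service.

```python
from collections import defaultdict
from typing import Callable, DefaultDict, List, Tuple

Handler = Callable[[str], None]


class ChannelBroker:
    """Hypothetical in-memory stand-in for a networked message system."""

    def __init__(self) -> None:
        # Maps a channel name to the handlers subscribed to it.
        self._subscribers: DefaultDict[str, List[Handler]] = defaultdict(list)

    def subscribe(self, channel: str, handler: Handler) -> None:
        self._subscribers[channel].append(handler)

    def publish(self, channel: str, message: str) -> None:
        # Deliver to every subscriber; the publisher never names a consumer.
        for handler in list(self._subscribers[channel]):
            handler(message)


def run_demo() -> List[Tuple[str, object]]:
    broker = ChannelBroker()
    results: List[Tuple[str, object]] = []

    def clean_step(message: str) -> None:
        # Data processing: normalize the record, then share it on "clean".
        broker.publish("clean", message.strip().lower())

    def archive_step(message: str) -> None:
        results.append(("archive", message))

    def count_step(message: str) -> None:
        results.append(("count", len(message)))

    # Steps are wired to channels, not to each other.
    broker.subscribe("raw", clean_step)
    broker.subscribe("clean", archive_step)
    broker.subscribe("clean", count_step)

    broker.publish("raw", "  Sample A ")
    return results


if __name__ == "__main__":
    print(run_demo())
```

In this sketch, adding a third consumer of `clean` requires only one more `subscribe` call; `clean_step` is unchanged because it knows the channel name and nothing about its consumers. That is the sense in which data sharing is decoupled from data processing.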
- Publication: arXiv e-prints
- Pub Date: December 2024
- arXiv: arXiv:2412.13339
- Bibcode: 2024arXiv241213339P
- Keywords: Computer Science - Distributed, Parallel, and Cluster Computing; Computer Science - Networking and Internet Architecture
- E-Print: 2 pages, 2 figures, accepted at the 19th Workshop on Workflows in Support of Large-Scale Science (WORKS24), IEEE/ACM The International Conference for High Performance Computing, Networking, Storage, and Analysis, SC24