Extensions to the Basic Model Interface to Support Serialization of a Model's State Variables for Load Balancing and Checkpointing
Abstract
In coordination with federal water prediction partners, NOAA's Office of Water Prediction (OWP) leads development of a model coupling framework called the Next Generation Water Resources Modeling Framework (Nextgen). This framework uses a non-invasive, community-standard API for computational models called the Basic Model Interface (BMI). Nextgen uses an Adapter-Mediator pattern to access model functionality through calls to BMI functions. BMI functions provide fine-grained control (initialize, update, finalize), variable getters and setters, and functions to retrieve information about a model's input and output variables (e.g., grid, rank, and units). This information allows the framework to automatically call mediators, when needed, to facilitate passing values of variables between models (e.g., for regridding, time interpolation, and unit conversion). BMI supports models written in C, C++, Fortran, and Python. The Nextgen project has implemented BMI for several hydrologic models and has run proof-of-concept tests. The original BMI standard did not envision running models on an HPC system with load balancing or checkpointing requirements. We added several new functions to BMI to support the serialization and deserialization of all of a model's state variables. This supports load balancing (stopping a model, moving it to a different computational node, and restarting it) as well as checkpointing (saving a model's state so that it can be restarted from a checkpoint after a hardware or power failure). Efficient serialization and deserialization is a complex and error-prone task that model developers shouldn't need to worry about. The BMI standard aims to simplify the task of model interoperability. BMI-enabled models need not be aware of other BMI-enabled models, and developers of these models need not anticipate how their BMI functionality might be used.
Consistent with this philosophy, we adopted a separation-of-concerns approach in which (1) the model developer implements a few fairly simple new functions for getting and setting the model's state variables, and (2) a general framework utility uses these new functions to do the work of serialization and deserialization. These new capabilities will support load balancing and checkpointing when the Nextgen framework is running a large number of model instances on an HPC system.
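The separation of concerns described above can be illustrated with a minimal Python sketch. It is not the Nextgen implementation: the abstract does not name the new BMI functions, so `get_state_var_names`, `get_state_var`, and `set_state_var`, the `ToyBucketModel` class, and the `serialize_state` and `deserialize_state` helpers are all hypothetical, and JSON is used only as a convenient stand-in for whatever serialization format the framework actually uses.

```python
import json


class ToyBucketModel:
    """Hypothetical toy model with BMI-style control functions."""

    def initialize(self, config_file=None):
        # State held by the model between time steps.
        self._state = {"time": 0.0, "storage": [1.0, 2.0, 3.0]}

    def update(self):
        # Advance one time step: drain 10% of the storage in each cell.
        self._state["time"] += 1.0
        self._state["storage"] = [0.9 * s for s in self._state["storage"]]

    def finalize(self):
        self._state = {}

    # (1) Model developer's side: simple state accessors.
    # These names are assumptions, not the functions added to BMI.
    def get_state_var_names(self):
        return sorted(self._state)

    def get_state_var(self, name):
        return self._state[name]

    def set_state_var(self, name, value):
        self._state[name] = value


# (2) Framework's side: generic utilities that know nothing about the
# model beyond the accessor functions above.
def serialize_state(model):
    """Collect every state variable and pack it into bytes."""
    payload = {name: model.get_state_var(name)
               for name in model.get_state_var_names()}
    return json.dumps(payload).encode("utf-8")


def deserialize_state(model, blob):
    """Restore every state variable from bytes produced by serialize_state."""
    for name, value in json.loads(blob.decode("utf-8")).items():
        model.set_state_var(name, value)
```

In this sketch, a checkpoint or a move to another node amounts to calling `serialize_state` on a running instance, then calling `initialize` and `deserialize_state` on a fresh instance, after which further `update` calls continue from the saved state. The model code never touches the byte format, which is the point of the split.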
Publication: AGU Fall Meeting Abstracts
Pub Date: December 2021
Bibcode: 2021AGUFM.H44D..04P