Progress with lossy compression of data from the Community Earth System Model
Abstract
Climate models, such as the Community Earth System Model (CESM), generate massive quantities of data, particularly when run at high spatial and temporal resolutions. The burden of storage is further exacerbated by creating large ensembles, generating large numbers of variables, outputting at high frequencies, and duplicating data archives (to protect against disk failures). Applying lossy compression methods to CESM datasets is an attractive means of reducing data storage requirements, but ensuring that the loss of information does not negatively impact science objectives is critical. In particular, test methods are needed to evaluate whether critical features (e.g., extreme values and spatial and temporal gradients) have been preserved and to boost scientists' confidence in the lossy compression process. We will provide an overview on our progress in applying lossy compression to CESM output and describe our unique suite of metric tests that evaluate the impact of information loss. Further, we will describe our processes how to choose an appropriate compression algorithm (and its associated parameters) given the diversity of CESM data (e.g., variables may be constant, smooth, change abruptly, contain missing values, or have large ranges). Traditional compression algorithms, such as those used for images, are not necessarily ideally suited for floating-point climate simulation data, and different methods may have different strengths and be more effective for certain types of variables than others. We will discuss our progress towards our ultimate goal of developing an automated multi-method parallel approach for compression of climate data that both maximizes data reduction and minimizes the impact of data loss on science results.
- Publication:
-
AGU Fall Meeting Abstracts
- Pub Date:
- December 2017
- Bibcode:
- 2017AGUFMIN11B0034X
- Keywords:
-
- 0525 Data management;
- COMPUTATIONAL GEOPHYSICS;
- 0545 Modeling;
- COMPUTATIONAL GEOPHYSICS;
- 0550 Model verification and validation;
- COMPUTATIONAL GEOPHYSICS;
- 0599 General or miscellaneous;
- COMPUTATIONAL GEOPHYSICS