Efficient and Flexible Climate Analysis with Python in a Cloud-Based Distributed Computing Framework
Abstract
As climate models become progressively more advanced, and spatial resolution further improved through various downscaling projects, climate projections at a local level are increasingly insightful and valuable. However, the raw size of climate datasets presents numerous hurdles for analysts wishing to develop customized climate risk metrics or perform site-specific statistical analysis. Four Twenty Seven, a climate risk consultancy, has implemented a Python-based distributed framework to analyze large climate datasets in the cloud. With the freedom afforded by efficiently processing these datasets, we are able to customize and continually develop new climate risk metrics using the most up-to-date data. Here we outline our process for using Python packages such as XArray and Dask to evaluate netCDF files in a distributed framework, StarCluster to operate in a cluster-computing environment, cloud computing services to access publicly hosted datasets, and how this setup is particularly valuable for generating climate change indicators and performing localized statistical analysis.
- Publication:
-
AGU Fall Meeting Abstracts
- Pub Date:
- December 2017
- Bibcode:
- 2017AGUFMIN21D0057G
- Keywords:
-
- 1626 Global climate models;
- GLOBAL CHANGE;
- 1920 Emerging informatics technologies;
- INFORMATICS;
- 1932 High-performance computing;
- INFORMATICS;
- 1994 Visualization and portrayal;
- INFORMATICS