Generalizing Statistical Functions for Climate Model Machine Learning and Ensemble Consistency Tests
Abstract
Principal component analysis (PCA) is an important machine learning tool in geoscience. However, there are expansive classes of closely related algorithms that are generally out of reach for the average scientist, particularly algorithms that extend PCA to higher-order moments and could therefore yield insights into nonlinear features and interactions. Our previous work shows that it is possible to build robust yet simple tools for computing principal components of geoscientific data with a variety of configurations and kernels. With this goal and a future ensemble consistency test package in mind, we have developed a suite of C++ language extensions for statistical calculations that includes an explicit representation of dimensionality and allows functions to be templated on dimensionality and arity (the number of function arguments). This representation exposes opportunities for automatic pre-optimization of symmetric dimensions and commutative function arguments, as well as for parallelization on heterogeneous computer architectures. Additionally, it allows us to write a single templated function that represents the kernels needed by dimensionality-dependent and arity-dependent machine learning algorithms. We present the current API and demonstrate examples of code, performance, and results derived from data from GFDL's AM4 model.
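To make the idea of templating a kernel on arity concrete, the following is a minimal sketch in standard C++ and is not the API presented in this work. The names `series_mean` and `central_comoment` are hypothetical, and the sketch assumes that every input series is a non-empty `std::vector<double>` of the same length. One templated function yields the second-order kernel used by ordinary PCA as well as its higher-order analogues.

```cpp
#include <array>
#include <cassert>
#include <cstddef>
#include <numeric>
#include <vector>

// Hypothetical helper: arithmetic mean of one non-empty series.
inline double series_mean(const std::vector<double>& x) {
    return std::accumulate(x.begin(), x.end(), 0.0) /
           static_cast<double>(x.size());
}

// Hypothetical kernel templated on arity N (the number of input series).
// Returns the N-th order central co-moment
//   (1/n) * sum_i prod_k (x_k[i] - mean(x_k)),
// assuming all N series share the same non-zero length n.
template <std::size_t N>
double central_comoment(const std::array<std::vector<double>, N>& series) {
    static_assert(N >= 1, "arity must be at least 1");
    const std::size_t n = series[0].size();
    assert(n > 0);

    // Mean of each argument, computed once per series.
    std::array<double, N> mu{};
    for (std::size_t k = 0; k < N; ++k) {
        assert(series[k].size() == n);  // assumption: equal lengths
        mu[k] = series_mean(series[k]);
    }

    // Accumulate the product of deviations across all N arguments.
    double sum = 0.0;
    for (std::size_t i = 0; i < n; ++i) {
        double prod = 1.0;
        for (std::size_t k = 0; k < N; ++k) {
            prod *= series[k][i] - mu[k];
        }
        sum += prod;
    }
    return sum / static_cast<double>(n);
}
```

With `N = 2` this sketch returns the population covariance of two series, and with `N = 3` it returns the third-order co-moment (an unnormalized coskewness). Reordering the input series leaves the result unchanged, which is the kind of commutativity the abstract describes as an opportunity for pre-optimization. The sketch itself does not exploit that symmetry.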
- Publication: AGU Fall Meeting Abstracts
- Pub Date: December 2019
- Bibcode: 2019AGUFMGC43D1359D
- Keywords: 0555 Neural networks, fuzzy logic, machine learning; COMPUTATIONAL GEOPHYSICS; 1626 Global climate models; GLOBAL CHANGE; 1942 Machine learning; INFORMATICS; 4313 Extreme events; NATURAL HAZARDS