Massive Computations in 3D Turbulent Flow: Science Goals and an Asynchronous GPU Algorithm Towards the Exascale
Abstract
Many fluid flows in geophysical problems are three-dimensional and turbulent, often subjected to other influences such as magnetic fields, solid-body rotation, or density stratification, but always highly nonlinear and characterized by a wide range of scales. Numerical simulations are vital to progress, but resolution requirements, especially at high Reynolds number, imply a continuing need for advanced supercomputing power. Most of the largest computations, conducted in simplified geometries, have employed CPU-based massive parallelism. For example, 8192³ simulations performed using Fourier pseudo-spectral methods have provided important insights into fine-scale turbulence, and elongated domains at comparable resolution are useful for the study of turbulence under a strong magnetic field. However, as computing advances to the pre-Exascale era dominated by accelerators such as GPUs, a substantial rethink is necessary for many existing paradigms, especially those which tend to be communication-intensive. We have developed an asynchronous algorithm with one-dimensional domain decomposition optimized for machines with large CPU memory and fast GPUs, in particular Summit at the Oak Ridge National Laboratory, which consists of IBM POWER9 CPUs and NVIDIA V100 GPUs. Data located in the CPU memory are processed in a fine-grained (batched) manner, with high-bandwidth NVLink transfers overlapped with computation, such that fast GPU computations in combination with the high-bandwidth system interconnect allow a much larger problem to be addressed than the much smaller GPU memory might suggest. Use of pinned memory and zero-copy memory movement also allows transfer of strided data between the CPU and GPU to be handled with high NVLink throughput. Several advanced communication protocols are explored in order to obtain maximum network performance for collective communication. A more generic code for three-dimensional fast Fourier transforms based on similar principles has also been built.
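The reason batching matters can be illustrated with a rough memory estimate. The sketch below is not the authors' code. It assumes single-precision values (4 bytes), a hypothetical count of 64 ranks, a 16 GiB GPU memory, and a hypothetical choice of four batch buffers resident on the GPU at once so that transfers and computation can overlap. The function name `batches_needed` and all of its parameters are illustrative.

```python
def batches_needed(n, bytes_per_value, n_ranks, gpu_bytes, buffers):
    """Estimate how many batches one rank's slab must be split into.

    All inputs are illustrative assumptions, not values from the abstract
    (apart from the grid size n).
    """
    total_bytes = n**3 * bytes_per_value   # one scalar field on the full n^3 grid
    slab_bytes = total_bytes // n_ranks    # 1D (slab) decomposition: equal share per rank
    budget = gpu_bytes // buffers          # GPU memory available to a single batch
    batches = -(-slab_bytes // budget)     # ceiling division
    return total_bytes, slab_bytes, batches


total, slab, nb = batches_needed(
    n=8192,                 # grid points per direction (from the abstract)
    bytes_per_value=4,      # assumption: single precision
    n_ranks=64,             # assumption: hypothetical rank count
    gpu_bytes=16 * 2**30,   # assumption: 16 GiB of GPU memory
    buffers=4,              # assumption: four batch buffers in flight
)
print(total / 2**40, "TiB per field")   # 2.0 TiB per field
print(slab / 2**30, "GiB per rank")     # 32.0 GiB per rank
print(nb, "batches per slab")           # 8 batches per slab
```

Under these assumptions a single field on the full grid occupies 2 TiB, far more than one GPU can hold, so each rank's slab has to stay in CPU memory and pass through the GPU in pieces. A real solver carries several such fields plus work arrays, so the actual batch count would be higher.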
- Publication: AGU Fall Meeting Abstracts
- Pub Date: December 2018
- Bibcode: 2018AGUFMNG32A..07Y
- Keywords:
  - 0545 Modeling (Computational Geophysics)
  - 1942 Machine learning (Informatics)
  - 4430 Complex systems (Nonlinear Geophysics)
  - 4490 Turbulence (Nonlinear Geophysics)