Development of a distributed memory parallel multiphase model for the direct numerical simulation of bottom boundary layer turbulence under combined wave-current flows
Abstract
Fine sediment transport and its potential to dampen turbulence under energetic waves and combined wave-current flows are critical to a better understanding of the fate of terrestrial sediment particles at the river mouth and, eventually, of coastal morphodynamics. The unsteady nature of these oscillatory flows necessitates a computationally intense, turbulence-resolving approach. Whereas a sophisticated shared-memory parallel model has been successfully used to simulate these flows in the intermittently turbulent regime (Re_max ~ 1000), scaling limits of shared-memory hardware prevent the model from performing very high-resolution (> 192x192x193) simulations within reasonable wall-clock times. Thus, to meet the need to simulate high-resolution, fully turbulent oscillatory flows, a new hybrid shared-memory/distributed-memory parallel model has been developed. Using OpenMP and MPI constructs, this new model implements a highly accurate pseudo-spectral scheme for an idealized oscillatory bottom boundary layer (OBBL). Data are stored locally and transferred between computational nodes as needed, so that the FFTs used to compute derivatives in the x- and y-directions and the Chebyshev polynomial transforms used to compute derivatives in the z-direction are evaluated entirely in-processor. The model is fully configurable at compile time, supporting multiple modes of operation (serial, OpenMP, MPI, or hybrid OpenMP+MPI), the available FFT libraries (DFTI, FFTW3), high-resolution timing, persistent or non-persistent MPI communication, etc. Output is fully distributed to support both independent and shared filesystems. At run time, the model automatically selects the best-performing algorithms for the given computational resources and domain size. Nearly 40 integrated test routines (derivatives, FFT transforms, eigenvalues, Poisson/Helmholtz solvers, etc.) are used to validate individual components of the model. Test simulations have been performed at the University of Florida High Performance Computing Center and at the Arctic Region Supercomputing Center. Simulated flows are shown to be consistent, producing identical results for all combinations of processor counts and parallel algorithms (OpenMP, MPI, and hybrid OpenMP+MPI) up through 2048 processors. For larger grids (256x256x257), parallel efficiency exceeds 50% up through 256 processors, while between 512 and 2048 processors the efficiency is between 20% and 30%. At 2048 processors, the parallel version is 412 times faster than the serial version. The model has been compared with the original shared-memory model and shown to produce nearly identical results (sediment and velocity profiles, power spectra, etc.). Test simulations up to 1000x1000x1001 have been performed, with even larger simulations planned, all of which show significant improvements in parallel efficiency.
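The abstract does not include code; purely as an illustration of the pseudo-spectral differentiation it describes, the following is a minimal sketch (not the authors' implementation) of an x-derivative computed with FFTW3 once the x-direction is entirely local to a rank. The function name ddx_spectral and all parameters are hypothetical.

```c
/* Hypothetical sketch: spectral d/dx for npencils contiguous x-lines of
 * length nx on a periodic domain of length Lx, using FFTW3 real transforms.
 * Assumes x is fully local to this rank after any needed transpose. */
#include <complex.h>   /* include before fftw3.h so fftw_complex == double complex */
#include <fftw3.h>
#include <math.h>
#include <string.h>

void ddx_spectral(const double *u, double *dudx, int nx, int npencils, double Lx)
{
    double       *work = fftw_alloc_real(nx);
    fftw_complex *uh   = fftw_alloc_complex(nx/2 + 1);
    fftw_plan fwd = fftw_plan_dft_r2c_1d(nx, work, uh, FFTW_ESTIMATE);
    fftw_plan bwd = fftw_plan_dft_c2r_1d(nx, uh, work, FFTW_ESTIMATE);

    for (int p = 0; p < npencils; p++) {
        memcpy(work, u + (size_t)p * nx, nx * sizeof(double));
        fftw_execute(fwd);                      /* u -> u_hat                 */
        for (int k = 0; k <= nx / 2; k++) {
            double kx = 2.0 * M_PI * k / Lx;    /* physical wavenumber        */
            if (2 * k == nx) kx = 0.0;          /* drop the Nyquist derivative */
            uh[k] *= I * kx / nx;               /* d/dx plus FFT normalization */
        }
        fftw_execute(bwd);                      /* u_hat -> du/dx             */
        memcpy(dudx + (size_t)p * nx, work, nx * sizeof(double));
    }

    fftw_destroy_plan(fwd);
    fftw_destroy_plan(bwd);
    fftw_free(uh);
    fftw_free(work);
}
```

The same loop over pencils could be threaded with OpenMP (one plan and scratch buffer per thread), which is one plausible way a hybrid OpenMP+MPI decomposition like the one described above could be organized.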
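Likewise, the abstract's statement that data are "transferred between computational nodes as appropriate" so that the Chebyshev z-derivatives are computed entirely in-processor suggests a global transpose. The sketch below shows one common way to do this with MPI_Alltoall, swapping a z-split slab layout for an x-split pencil layout so that z becomes contiguous locally; the routine transpose_xz, the memory layout, and the divisibility assumptions are illustrative and not taken from the model.

```c
/* Hypothetical sketch: redistribute an x-y slab decomposition (z split across
 * ranks) into a z-pencil decomposition (x split across ranks) so a Chebyshev
 * z-derivative can be taken entirely in-processor.  Assumes nx and nz are
 * divisible by the number of ranks and that blocks fit in an int count. */
#include <mpi.h>
#include <stdlib.h>

void transpose_xz(const double *slab, double *pencil,
                  int nx, int ny, int nz, MPI_Comm comm)
{
    int nprocs;
    MPI_Comm_size(comm, &nprocs);

    const int nz_loc = nz / nprocs;   /* z planes held locally before the swap */
    const int nx_loc = nx / nprocs;   /* x columns held locally afterwards     */
    const size_t blk = (size_t)nx_loc * ny * nz_loc;  /* block per rank pair   */

    double *sendbuf = malloc(nprocs * blk * sizeof *sendbuf);
    double *recvbuf = malloc(nprocs * blk * sizeof *recvbuf);

    /* Pack: for destination rank p, copy the x-range it will own. */
    for (int p = 0; p < nprocs; p++)
        for (int k = 0; k < nz_loc; k++)
            for (int j = 0; j < ny; j++)
                for (int i = 0; i < nx_loc; i++)
                    sendbuf[((size_t)p * nz_loc * ny + (size_t)k * ny + j) * nx_loc + i] =
                        slab[((size_t)k * ny + j) * nx + p * nx_loc + i];

    MPI_Alltoall(sendbuf, (int)blk, MPI_DOUBLE,
                 recvbuf, (int)blk, MPI_DOUBLE, comm);

    /* Unpack: the block from rank p carries global z planes
     * p*nz_loc .. (p+1)*nz_loc - 1 for this rank's x columns. */
    for (int p = 0; p < nprocs; p++)
        for (int k = 0; k < nz_loc; k++)
            for (int j = 0; j < ny; j++)
                for (int i = 0; i < nx_loc; i++)
                    pencil[((size_t)(p * nz_loc + k) * ny + j) * nx_loc + i] =
                        recvbuf[((size_t)p * nz_loc * ny + (size_t)k * ny + j) * nx_loc + i];

    free(sendbuf);
    free(recvbuf);
}
```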
- Publication:
- AGU Fall Meeting Abstracts
- Pub Date:
- December 2012
- Bibcode:
- 2012AGUFMEP23D0846D
- Keywords:
- 0545 COMPUTATIONAL GEOPHYSICS / Modeling;
- 1932 INFORMATICS / High-performance computing;
- 4558 OCEANOGRAPHY: PHYSICAL / Sediment transport;
- 4568 OCEANOGRAPHY: PHYSICAL / Turbulence, diffusion, and mixing processes