GAMER: A Graphic Processing Unit Accelerated Adaptive-Mesh-Refinement Code for Astrophysics
Abstract
We present the newly developed code, GPU-accelerated Adaptive-MEsh-Refinement code (GAMER), which adopts a novel approach to improving the performance of adaptive-mesh-refinement (AMR) astrophysical simulations by a large factor with the use of the graphic processing unit (GPU). The AMR implementation is based on a hierarchy of grid patches with an oct-tree data structure. We adopt a three-dimensional relaxing total variation diminishing scheme for the hydrodynamic solver and a multi-level relaxation scheme for the Poisson solver. Both solvers have been implemented on the GPU, so that hundreds of patches can be advanced in parallel. The computational overhead associated with data transfer between the CPU and GPU is carefully reduced by utilizing the GPU's capability for asynchronous memory copies, and the time spent computing the ghost-zone values for each patch is diminished by overlapping it with the GPU computations. We demonstrate the accuracy of the code by performing several standard test problems in astrophysics. GAMER is a parallel code that can be run on a multi-GPU cluster system. We measure the performance of the code by performing purely baryonic cosmological simulations in different hardware implementations, in which detailed timing analyses compare the computations with and without GPU acceleration. Maximum speed-up factors of 12.19 and 10.47 are demonstrated using one GPU with 4096^3 effective resolution and 16 GPUs with 8192^3 effective resolution, respectively.
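The patch hierarchy described in the abstract can be pictured with a minimal Python sketch. This is an illustration only, not GAMER's actual data structure: the `Patch` class, the `PATCH_SIZE` of 8 cells per side, and the helper names `count_leaves` and `effective_resolution` are all assumptions introduced here. It assumes a refinement ratio of 2, as implied by the oct-tree, so each refined patch is covered by 8 children of the same cell count at the next level.

```python
from dataclasses import dataclass, field
from typing import List, Tuple

PATCH_SIZE = 8  # cells per side in every patch (assumed value for illustration)


@dataclass
class Patch:
    level: int                    # refinement level, 0 = root grid
    corner: Tuple[int, int, int]  # lower corner, in cell units of this level
    children: List["Patch"] = field(default_factory=list)

    def refine(self) -> List["Patch"]:
        """Split this patch into 8 children at level + 1, one per octant."""
        if self.children:
            return self.children
        # Coordinates double when moving to the next finer level.
        x, y, z = (2 * c for c in self.corner)
        for dz in (0, 1):
            for dy in (0, 1):
                for dx in (0, 1):
                    child_corner = (x + dx * PATCH_SIZE,
                                    y + dy * PATCH_SIZE,
                                    z + dz * PATCH_SIZE)
                    self.children.append(Patch(self.level + 1, child_corner))
        return self.children


def count_leaves(patch: Patch) -> int:
    """Number of unrefined patches in the subtree rooted at `patch`."""
    if not patch.children:
        return 1
    return sum(count_leaves(child) for child in patch.children)


def effective_resolution(root_cells_per_side: int, max_level: int) -> int:
    """Cells per side if the finest level covered the whole domain."""
    return root_cells_per_side * 2 ** max_level
```

Under these assumptions, a hypothetical 128-cell root grid with 5 refinement levels would give `effective_resolution(128, 5) == 4096` cells per side. The abstract does not state which root grid and level count produced the quoted effective resolutions, so these numbers are purely illustrative.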
- Publication:
- The Astrophysical Journal Supplement Series
- Pub Date:
- February 2010
- DOI:
- 10.1088/0067-0049/186/2/457
- arXiv:
- arXiv:0907.3390
- Bibcode:
- 2010ApJS..186..457S
- Keywords:
- gravitation;
- hydrodynamics;
- methods: numerical;
- Astrophysics - Instrumentation and Methods for Astrophysics;
- Astrophysics - Cosmology and Extragalactic Astrophysics
- E-Print:
- 60 pages, 22 figures, 3 tables. More accuracy tests are included. Accepted for publication in ApJS