Performance and accuracy of a GRAPE-3 system for collisionless N-body simulations
Abstract
The performance and accuracy of a GRAPE-3 system for collisionless N-body simulations are discussed. After an initial description of the hardware configurations available to us at Marseille, and of the usefulness of on-line analysis, we concentrate on the actual performance and accuracy of direct summation and of tree code software. For the former we discuss the sources of round-off errors. The standard Barnes-Hut tree code cannot be used as such on a GRAPE-3 system. Instead, particles are divided into blocks and the tree traversal is performed once for the whole block, rather than for each particle in the block separately. The forces are then calculated by direct summation over the whole interaction list. The performance of the tree code depends on the number of particles in the block, the optimum number depending on the speed of the front end and the number of boards. We find that the code scales as O(N) and explain this behaviour. The time per step decreases as the tolerance increases, but the dependence is much weaker than for the standard tree code. We also find that, contrary to what is expected for the standard version, the speed of our tree code increases with the clustering of the configuration. We discuss the effect of the front end and compare the performance of direct summation and of the tree code on GRAPE-3 with that of other software on general-purpose computers. The accuracy of both direct summation and the tree code is discussed as a function of the number of particles and of the softening. For this we consider the accuracy of the force calculation as well as the energy conservation during a simulation. Because of the increased role of direct summation in the force calculation, our tree code is much more accurate than the standard one. Finally, we follow the evolution of an isolated barred galaxy using different hardware and software in order to assess the reliability and reproducibility of our results. We find excellent agreement between the pattern speed of the bar in direct summation simulations run on the high-precision GRAPE-4 machines and that in direct summation simulations run on our GRAPE-3 system. The agreement with the tree code is also very good provided the tolerance values are smaller than about 1.0. We conclude that GRAPE-3 systems are well suited for collisionless simulations, and in particular for those of galaxies. This is due to their good accuracy and their high speed, which allows the use of a large number of particles.
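The two quantities the abstract uses to judge accuracy, the softened force from direct summation and the total energy whose conservation is monitored during a run, can be sketched as follows. This is only an illustrative sketch in double-precision Python: it assumes Plummer-type softening and units with G = 1, the function names `direct_accelerations` and `total_energy` are hypothetical, and it does not model the reduced-precision arithmetic of the GRAPE-3 hardware or the block-based tree traversal.

```python
import math


def direct_accelerations(pos, mass, eps):
    """Softened accelerations by direct summation over all pairs (G = 1).

    pos  : list of [x, y, z] positions
    mass : list of particle masses
    eps  : softening length (Plummer-type softening is an assumption here)
    """
    n = len(pos)
    acc = [[0.0, 0.0, 0.0] for _ in range(n)]
    for i in range(n):
        for j in range(i + 1, n):
            # Separation vector from particle i to particle j.
            dx = [pos[j][k] - pos[i][k] for k in range(3)]
            r2 = dx[0] ** 2 + dx[1] ** 2 + dx[2] ** 2 + eps * eps
            inv_r3 = 1.0 / (r2 * math.sqrt(r2))
            for k in range(3):
                # Equal and opposite pairwise forces: i is pulled towards j.
                acc[i][k] += mass[j] * dx[k] * inv_r3
                acc[j][k] -= mass[i] * dx[k] * inv_r3
    return acc


def total_energy(pos, vel, mass, eps):
    """Kinetic plus softened potential energy of the system (G = 1)."""
    n = len(pos)
    kinetic = sum(
        0.5 * m * (v[0] ** 2 + v[1] ** 2 + v[2] ** 2) for m, v in zip(mass, vel)
    )
    potential = 0.0
    for i in range(n):
        for j in range(i + 1, n):
            r2 = sum((pos[j][k] - pos[i][k]) ** 2 for k in range(3)) + eps * eps
            potential -= mass[i] * mass[j] / math.sqrt(r2)
    return kinetic + potential
```

In a simulation, the relative drift of `total_energy` between the first and the current step gives a simple energy-conservation diagnostic, while comparing `direct_accelerations` against a lower-precision or tree-based force estimate gives a force-error diagnostic. The O(N^2) pair loop is what makes direct summation expensive on general-purpose computers.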
- Publication: Monthly Notices of the Royal Astronomical Society
- Pub Date: February 1998
- DOI: 10.1046/j.1365-8711.1998.01102.x
- arXiv: arXiv:astro-ph/9709246
- Bibcode: 1998MNRAS.293..369A
- Keywords: METHODS: NUMERICAL; GALAXIES: KINEMATICS AND DYNAMICS; GALAXIES: STRUCTURE; Astrophysics
- E-Print: 13 pages LaTeX, with 11 figures, accepted for publication in MNRAS