C programs for solving the time-dependent Gross-Pitaevskii equation in a fully anisotropic trap
Abstract
We present C programming language versions of the previously published Fortran programs (Muruganandam and Adhikari (2009) [1]) for calculating both stationary and non-stationary solutions of the time-dependent Gross-Pitaevskii (GP) equation. The GP equation describes the properties of dilute Bose-Einstein condensates at ultra-cold temperatures. The C versions of the programs use the same algorithms as the Fortran ones, involving real- and imaginary-time propagation based on a split-step Crank-Nicolson method. In the one-space-variable form of the GP equation, we consider the one-dimensional, two-dimensional circularly-symmetric, and three-dimensional spherically-symmetric harmonic-oscillator traps. In the two-space-variable form, we consider the GP equation in two-dimensional anisotropic and three-dimensional axially-symmetric traps. The fully-anisotropic three-dimensional GP equation is also considered. In addition to these twelve programs, for the six algorithms that involve two and three space variables, we have also developed threaded (OpenMP parallelized) programs, which allow numerical simulations to use all available CPU cores on a computer. All 18 programs are optimized and accompanied by makefiles for several popular C compilers. We present typical results for the scalability of the threaded codes and demonstrate the almost linear speedup obtained with the new programs, allowing a decrease in execution times by an order of magnitude on modern multi-core computers.
New version program summary
Program title: GP-SCL package, consisting of: (i) imagtime1d, (ii) imagtime2d, (iii) imagtime2d-th, (iv) imagtimecir, (v) imagtime3d, (vi) imagtime3d-th, (vii) imagtimeaxial, (viii) imagtimeaxial-th, (ix) imagtimesph, (x) realtime1d, (xi) realtime2d, (xii) realtime2d-th, (xiii) realtimecir, (xiv) realtime3d, (xv) realtime3d-th, (xvi) realtimeaxial, (xvii) realtimeaxial-th, (xviii) realtimesph.
Catalogue identifier: AEDU_v2_0.
Program Summary URL: http://cpc.cs.qub.ac.uk/summaries/AEDU_v2_0.html.
Program obtainable from: CPC Program Library, Queen's University of Belfast, N. Ireland.
Licensing provisions: Standard CPC licence, http://cpc.cs.qub.ac.uk/licence/licence.html.
No. of lines in distributed program, including test data, etc.: 180 583.
No. of bytes in distributed program, including test data, etc.: 1 188 688.
Distribution format: tar.gz.
Programming language: C and C/OpenMP.
Computer: Any modern computer with a C language compiler installed.
Operating system: Linux, Unix, Mac OS, Windows.
RAM: Memory used with the supplied input files: 2-4 MB (i, iv, ix, x, xiii, xvi, xvii, xviii), 8 MB (xi, xii), 32 MB (vii, viii), 80 MB (ii, iii), 700 MB (xiv, xv), 1.2 GB (v, vi).
Number of processors used: For threaded (OpenMP parallelized) programs, all available CPU cores on the computer.
Classification: 2.9, 4.3, 4.12.
Catalogue identifier of previous version: AEDU_v1_0.
Journal reference of previous version: Comput. Phys. Commun. 180 (2009) 1888.
Does the new version supersede the previous version?: No.
Nature of problem: These programs are designed to solve the time-dependent Gross-Pitaevskii (GP) nonlinear partial differential equation in one, two or three space dimensions with a harmonic, circularly-symmetric, spherically-symmetric, axially-symmetric or fully anisotropic trap. The GP equation describes the properties of a dilute trapped Bose-Einstein condensate.
Solution method: The time-dependent GP equation is solved by the split-step Crank-Nicolson method by discretizing in space and time. The discretized equation is then solved by propagation, in either imaginary or real time, over small time steps. The method yields solutions of stationary and/or non-stationary problems.
Reasons for the new version: Previous Fortran programs [1] are used within the ultra-cold atoms [2-11] and nonlinear optics [12,13] communities, as well as in various other fields [14-16].
This new version is a translation of all programs to the C programming language, which will make them accessible to wider parts of the corresponding communities. It is well known that numerical simulations of the GP equation in highly experimentally relevant geometries with two or three space variables are computationally very demanding, which presents an obstacle to detailed numerical studies of such systems. For this reason, we have developed threaded (OpenMP parallelized) versions of the programs imagtime2d, imagtime3d, imagtimeaxial, realtime2d, realtime3d, realtimeaxial, which are named imagtime2d-th, imagtime3d-th, imagtimeaxial-th, realtime2d-th, realtime3d-th, realtimeaxial-th, respectively. Fig. 1 shows the scalability results obtained for the OpenMP versions of the programs realtime2d and realtime3d. As we can see, the speedup is almost linear: on a computer with a total of 8 CPU cores we observe in Fig. 1(a) a maximal speedup of around 7, or roughly 90% of the ideal speedup, while on a computer with 12 CPU cores we find in Fig. 1(b) that the maximal speedup is around 9.6, or 80% of the ideal speedup. Such a speedup represents a significant improvement in performance.
Summary of revisions: All Fortran programs from the previous version [1] are translated to C and named in the same way. The structure of all programs is identical. We have introduced the use of comprehensive input files, where all parameters are explained in detail and can be set by a user. We have also included makefiles with tested and verified settings for GNU's gcc compiler, Intel's icc compiler, IBM's xlc compiler, PGI's pgcc compiler, and Oracle's suncc (formerly Sun's) compiler. In addition to this, 6 new threaded (OpenMP parallelized) programs are supplied (imagtime2d-th, imagtime3d-th, imagtimeaxial-th, realtime2d-th, realtime3d-th, realtimeaxial-th) for algorithms involving two or three space variables.
They are written by OpenMP-parallelizing the most computationally demanding loops in the functions performing time evolution (calcnu, calclux, calcluy, calcluz), normalization (calcnorm), and calculation of physical quantities (calcmuen, calcrms). Since some of the dynamically allocated array variables are used within such loops, they had to be made private to each thread. This was done by allocating matrices instead of arrays, with the first index in all such matrices corresponding to a thread number.
Additional comments: This package consists of 18 programs (see Program title above), out of which 12 programs (i, ii, iv, v, vii, ix, x, xi, xiii, xiv, xvi, xviii) are serial, while 6 programs (iii, vi, viii, xii, xv, xvii) are threaded (OpenMP parallelized). For the particular purpose of each program, please see the descriptions below.
Running time: All running times given in the descriptions below refer to programs compiled with gcc on a quad-core Intel Xeon X5460 at 3.16 GHz (CPU1), and programs compiled with icc on a quad-core Intel Nehalem E5540 at 2.53 GHz (CPU2). With the supplied input files, running times on CPU1 are: 5 min (i, iv, ix, xii, xiii, xvii, xviii), 10 min (viii, xvi), 15 min (iii, x, xi), 30 min (ii, vi, vii), 2 h (v), 4 h (xv), 15 h (xiv). On CPU2, running times are: 5 min (i, iii, iv, viii, ix, xii, xiii, xvi, xvii, xviii), 10 min (vi, x, xi), 20 min (ii, vii), 1 h (v), 2 h (xv), 12 h (xiv).
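The per-thread scratch storage described under "Summary of revisions" can be sketched as follows. This is a minimal illustration under stated assumptions, not code from the GP-SCL package: the names alloc_matrix, free_matrix and cn_step_rows are hypothetical, and the model problem (one Crank-Nicolson step of psi_t = psi_xx along x for each row of a 2D array, with zero boundary values) is a simplified stand-in for the actual time-evolution functions. The point it shows is that the work arrays of the tridiagonal solver become matrices whose first index is the thread number, so that threads never overwrite each other's intermediate data.

```c
#include <assert.h>
#include <stdlib.h>
#ifdef _OPENMP
#include <omp.h>
#endif

/* Hypothetical helper: nrows x ncols matrix of doubles, zero-initialized,
   stored as one contiguous block. Returns NULL on allocation failure. */
static double **alloc_matrix(int nrows, int ncols) {
    double **m = malloc((size_t)nrows * sizeof *m);
    double *data = calloc((size_t)nrows * (size_t)ncols, sizeof *data);
    if (m == NULL || data == NULL) {
        free(m);
        free(data);
        return NULL;
    }
    for (int i = 0; i < nrows; i++)
        m[i] = data + (size_t)i * (size_t)ncols;
    return m;
}

static void free_matrix(double **m) {
    if (m != NULL) {
        free(m[0]);
        free(m);
    }
}

/* Assumed model problem: one Crank-Nicolson step of psi_t = psi_xx along x
   for every row j, with psi = 0 at both ends of each row and
   r = dt / (2 dx^2). Returns 0 on success, -1 on allocation failure. */
int cn_step_rows(double **psi, int ny, int nx, double r) {
    assert(nx >= 3);
    int nthreads = 1;
#ifdef _OPENMP
    nthreads = omp_get_max_threads();
#endif
    /* Scratch "matrices": the first index is the thread number. */
    double **cp = alloc_matrix(nthreads, nx);
    double **dp = alloc_matrix(nthreads, nx);
    if (cp == NULL || dp == NULL) {
        free_matrix(cp);
        free_matrix(dp);
        return -1;
    }
    const double a = -r;            /* sub- and super-diagonal */
    const double b = 1.0 + 2.0 * r; /* main diagonal */

    #pragma omp parallel for
    for (int j = 0; j < ny; j++) {
        int t = 0;
#ifdef _OPENMP
        t = omp_get_thread_num();
#endif
        double *row = psi[j], *c = cp[t], *d = dp[t];
        /* Forward sweep of the Thomas algorithm over interior points. */
        c[1] = a / b;
        d[1] = ((1.0 - 2.0 * r) * row[1] + r * (row[0] + row[2])) / b;
        for (int i = 2; i <= nx - 2; i++) {
            double m = b - a * c[i - 1];
            double rhs = (1.0 - 2.0 * r) * row[i] + r * (row[i - 1] + row[i + 1]);
            c[i] = a / m;
            d[i] = (rhs - a * d[i - 1]) / m;
        }
        /* Back substitution, overwriting the row in place. */
        row[0] = 0.0;
        row[nx - 1] = 0.0;
        for (int i = nx - 2; i >= 1; i--)
            row[i] = d[i] - c[i] * row[i + 1];
    }
    free_matrix(cp);
    free_matrix(dp);
    return 0;
}
```

With gcc, the sketch can be compiled with the -fopenmp option to enable threading; without it, the pragma is ignored and the same code runs serially using a single row of scratch storage.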
- Publication:
- Computer Physics Communications
- Pub Date:
- September 2012
- DOI:
- arXiv:
- arXiv:1206.1361
- Bibcode:
- 2012CoPhC.183.2021V
- Keywords:
- Condensed Matter - Quantum Gases;
- High Energy Physics - Theory;
- Physics - Computational Physics;
- Quantum Physics
- E-Print:
- 8 pages, 1 figure