C programs for solving the time-dependent Gross-Pitaevskii equation in a fully anisotropic trap
Abstract
We present C programming language versions of earlier published Fortran programs (Muruganandam and Adhikari (2009) [1]) for calculating both stationary and nonstationary solutions of the time-dependent Gross-Pitaevskii (GP) equation. The GP equation describes the properties of dilute Bose-Einstein condensates at ultracold temperatures. The C versions use the same algorithms as the Fortran ones, involving real- and imaginary-time propagation based on a split-step Crank-Nicolson method. In the one-space-variable form of the GP equation, we consider the one-dimensional, two-dimensional circularly-symmetric, and three-dimensional spherically-symmetric harmonic-oscillator traps. In the two-space-variable form, we consider the GP equation in two-dimensional anisotropic and three-dimensional axially-symmetric traps. The fully-anisotropic three-dimensional GP equation is also considered. In addition to these twelve programs, for six algorithms that involve two and three space variables, we have also developed threaded (OpenMP parallelized) programs, which allow numerical simulations to use all available CPU cores on a computer. All 18 programs are optimized and accompanied by makefiles for several popular C compilers. We present typical results for scalability of threaded codes and demonstrate almost linear speedup obtained with the new programs, allowing a decrease in execution times by an order of magnitude on modern multicore computers.
New version program summary
Program title: GP-SCL package, consisting of: (i) imagtime1d, (ii) imagtime2d, (iii) imagtime2d-th, (iv) imagtimecir, (v) imagtime3d, (vi) imagtime3d-th, (vii) imagtimeaxial, (viii) imagtimeaxial-th, (ix) imagtimesph, (x) realtime1d, (xi) realtime2d, (xii) realtime2d-th, (xiii) realtimecir, (xiv) realtime3d, (xv) realtime3d-th, (xvi) realtimeaxial, (xvii) realtimeaxial-th, (xviii) realtimesph.
Catalogue identifier: AEDU_v2_0.
Program Summary URL: http://cpc.cs.qub.ac.uk/summaries/AEDU_v2_0.html
Program obtainable from: CPC Program Library, Queen's University of Belfast, N. Ireland.
Licensing provisions: Standard CPC licence, http://cpc.cs.qub.ac.uk/licence/licence.html
No. of lines in distributed program, including test data, etc.: 180 583.
No. of bytes in distributed program, including test data, etc.: 1 188 688.
Distribution format: tar.gz.
Programming language: C and C/OpenMP.
Computer: Any modern computer with C language compiler installed.
Operating system: Linux, Unix, Mac OS, Windows.
RAM: Memory used with the supplied input files: 2-4 MB (i, iv, ix, x, xiii, xvi, xvii, xviii), 8 MB (xi, xii), 32 MB (vii, viii), 80 MB (ii, iii), 700 MB (xiv, xv), 1.2 GB (v, vi).
Number of processors used: For threaded (OpenMP parallelized) programs, all available CPU cores on the computer.
Classification: 2.9, 4.3, 4.12.
Catalogue identifier of previous version: AEDU_v1_0.
Journal reference of previous version: Comput. Phys. Commun. 180 (2009) 1888.
Does the new version supersede the previous version?: No.
Nature of problem: These programs are designed to solve the time-dependent Gross-Pitaevskii (GP) nonlinear partial differential equation in one, two or three space dimensions with a harmonic, circularly-symmetric, spherically-symmetric, axially-symmetric or fully anisotropic trap. The GP equation describes the properties of a dilute trapped Bose-Einstein condensate.
Solution method: The time-dependent GP equation is solved by the split-step Crank-Nicolson method by discretizing in space and time. The discretized equation is then solved by propagation, in either imaginary or real time, over small time steps. The method yields solutions of stationary and/or nonstationary problems.
Reasons for the new version: Previous Fortran programs [1] are used within the ultracold atoms [2-11] and nonlinear optics [12,13] communities, as well as in various other fields [14-16].
This new version is a translation of all programs to the C programming language, which will make them accessible to wider parts of the corresponding communities. It is well known that numerical simulations of the GP equation in highly experimentally relevant geometries with two or three space variables are computationally very demanding, which presents an obstacle in detailed numerical studies of such systems. For this reason, we have developed threaded (OpenMP parallelized) versions of programs imagtime2d, imagtime3d, imagtimeaxial, realtime2d, realtime3d, realtimeaxial, which are named imagtime2d-th, imagtime3d-th, imagtimeaxial-th, realtime2d-th, realtime3d-th, realtimeaxial-th, respectively. Fig. 1 shows the scalability results obtained for OpenMP versions of programs realtime2d and realtime3d. As we can see, the speedup is almost linear, and on a computer with a total of 8 CPU cores we observe in Fig. 1(a) a maximal speedup of around 7, or roughly 90% of the ideal speedup, while on a computer with 12 CPU cores we find in Fig. 1(b) that the maximal speedup is around 9.6, or 80% of the ideal speedup. Such a speedup represents a significant improvement in performance.
Summary of revisions: All Fortran programs from the previous version [1] are translated to C and named in the same way. The structure of all programs is identical. We have introduced the use of comprehensive input files, where all parameters are explained in detail and can be set by a user. We have also included makefiles with tested and verified settings for GNU's gcc compiler, Intel's icc compiler, IBM's xlc compiler, PGI's pgcc compiler, and Oracle's suncc (formerly Sun's) compiler. In addition to this, 6 new threaded (OpenMP parallelized) programs are supplied (imagtime2d-th, imagtime3d-th, imagtimeaxial-th, realtime2d-th, realtime3d-th, realtimeaxial-th) for algorithms involving two or three space variables.
They are written by OpenMP-parallelizing the most computationally demanding loops in functions performing time evolution (calcnu, calclux, calcluy, calcluz), normalization (calcnorm), and calculation of physical quantities (calcmuen, calcrms). Since some of the dynamically allocated array variables are used within such loops, they had to be made private for each thread. This was done by allocating matrices instead of arrays, with the first index in all such matrices corresponding to a thread number.
Additional comments: This package consists of 18 programs, see Program title above, out of which 12 programs (i, ii, iv, v, vii, ix, x, xi, xiii, xiv, xvi, xviii) are serial, while 6 programs (iii, vi, viii, xii, xv, xvii) are threaded (OpenMP parallelized). For the particular purpose of each program, please see descriptions below.
Running time: All running times given in descriptions below refer to programs compiled with gcc on quad-core Intel Xeon X5460 at 3.16 GHz (CPU1), and programs compiled with icc on quad-core Intel Nehalem E5540 at 2.53 GHz (CPU2). With the supplied input files, running times on CPU1 are: 5 min (i, iv, ix, xii, xiii, xvii, xviii), 10 min (viii, xvi), 15 min (iii, x, xi), 30 min (ii, vi, vii), 2 h (v), 4 h (xv), 15 h (xiv). On CPU2, running times are: 5 min (i, iii, iv, viii, ix, xii, xiii, xvi, xvii, xviii), 10 min (vi, x, xi), 20 min (ii, vii), 1 h (v), 2 h (xv), 12 h (xiv).
Publication: Computer Physics Communications
Pub Date: September 2012
DOI: 10.1016/j.cpc.2012.03.022
arXiv: arXiv:1206.1361
Bibcode: 2012CoPhC.183.2021V
Keywords: Condensed Matter - Quantum Gases; High Energy Physics - Theory; Physics - Computational Physics; Quantum Physics
E-Print: 8 pages, 1 figure