abinit/tests/unitary/Refs/tfourwf_04.stdout

155 lines
8.2 KiB
Plaintext

.Version 9.3.1 of FFTPROF
.(MPI version, prepared for a x86_64_linux_gnu9.3 computer)
.Copyright (C) 1998-2025 ABINIT group .
FFTPROF comes with ABSOLUTELY NO WARRANTY.
It is free software, and you are welcome to redistribute it
under certain conditions (GNU General Public License,
see ~abinit/COPYING or http://www.gnu.org/copyleft/gpl.txt).
ABINIT is a project of the Universite Catholique de Louvain,
Corning Inc. and other collaborators, see ~abinit/doc/developers/contributors.txt .
Please read https://docs.abinit.org/theory/acknowledgments for suggested
acknowledgments of the ABINIT effort.
For more information, see https://www.abinit.org .
.Starting date : Tue 10 Nov 2020.
- ( at 22h24 )
Tool for profiling and testing the FFT libraries used in ABINIT.
Allowed options are:
fourdp --> Test FFT transforms of density and potentials on the full box.
fourwf --> Test FFT transforms of wavefunctions using the zero-pad algorithm.
gw_fft --> Test the FFT transforms used in the GW code.
all --> Test all FFT routines.
Real(R)+Recip(G) space primitive vectors, cartesian coordinates (Bohr,Bohr^-1):
R(1)= 20.0000000 0.0000000 0.0000000 G(1)= 0.0500000 0.0000000 0.0000000
R(2)= 0.0000000 20.0000000 0.0000000 G(2)= 0.0000000 0.0500000 0.0000000
R(3)= 0.0000000 0.0000000 20.0000000 G(3)= 0.0000000 0.0000000 0.0500000
Unit cell volume ucvol= 8.0000000E+03 bohr^3
Unit cell volume ucvol= 8.0000000E+03 bohr^3
Angles (23,13,12)= 9.00000000E+01 9.00000000E+01 9.00000000E+01 degrees
Angles (23,13,12)= 9.00000000E+01 9.00000000E+01 9.00000000E+01 degrees
==== FFT setup for fftalg 110 ====
FFT mesh divisions ........................ 100 100 100
Augmented FFT divisions ................... 101 101 100
FFT algorithm ............................. 110
FFT cache size ............................ 16
==== FFT setup for fftalg 111 ====
FFT mesh divisions ........................ 100 100 100
Augmented FFT divisions ................... 101 101 100
FFT algorithm ............................. 111
FFT cache size ............................ 16
==== FFT setup for fftalg 112 ====
FFT mesh divisions ........................ 100 100 100
Augmented FFT divisions ................... 101 101 100
FFT algorithm ............................. 112
FFT cache size ............................ 16
==== FFT setup for fftalg 411 ====
FFT mesh divisions ........................ 100 100 100
Augmented FFT divisions ................... 101 101 100
FFT algorithm ............................. 411
FFT cache size ............................ 16
==== FFT setup for fftalg 412 ====
FFT mesh divisions ........................ 100 100 100
Augmented FFT divisions ................... 101 101 100
FFT algorithm ............................. 412
FFT cache size ............................ 16
==== FFT setup for fftalg 312 ====
FFT mesh divisions ........................ 100 100 100
Augmented FFT divisions ................... 101 101 100
FFT algorithm ............................. 312
FFT cache size ............................ 16
==== FFT setup for fftalg 512 ====
FFT mesh divisions ........................ 100 100 100
Augmented FFT divisions ................... 101 101 100
FFT algorithm ............................. 512
FFT cache size ............................ 16
- Will use non-blocking ialltoall for MPI-FFT
==============================================================
==== fourwf with option 0, cplex 0, ndat 4, istwf_k 1 ====
==============================================================
Library CPU-time WALL-time nthreads ncalls Max_|Err| <|Err|>
- Goedecker (110) 0.1188 0.1192 1 (100%) 2 0.00E+00 0.00E+00
- Goedecker (111) 0.0906 0.0909 1 (100%) 2 0.00E+00 0.00E+00
- Goedecker (112) 0.0682 0.0684 1 (100%) 2 5.78E-14 1.76E-15
- Goedecker2002 (411) 0.0863 0.0865 1 (100%) 2 6.02E-14 1.79E-15
- Goedecker2002 (412) 0.0840 0.0843 1 (100%) 2 6.02E-14 1.79E-15
- FFTW3 (312) 0.0492 0.0494 1 (100%) 2 9.45E-14 2.09E-15
- DFTI (512) N/A N/A N/A N/A N/A N/A
Consistency check: MAX(Max_|Err|) = 9.45E-14, Max(<|Err|>) = 2.09E-15, reference_lib: Goedecker (110)
==============================================================
==== fourwf with option 1, cplex 1, ndat 4, istwf_k 1 ====
==============================================================
Library CPU-time WALL-time nthreads ncalls Max_|Err| <|Err|>
- Goedecker (110) 0.1153 0.1158 1 (100%) 2 0.00E+00 0.00E+00
- Goedecker (111) 0.0804 0.0807 1 (100%) 2 0.00E+00 0.00E+00
- Goedecker (112) 0.0622 0.0624 1 (100%) 2 7.28E-11 1.28E-14
- Goedecker2002 (411) 0.0858 0.0860 1 (100%) 2 8.73E-11 1.37E-14
- Goedecker2002 (412) 0.0792 0.0795 1 (100%) 2 8.73E-11 1.37E-14
- FFTW3 (312) 0.0474 0.0476 1 (100%) 2 1.02E-10 1.79E-14
- DFTI (512) N/A N/A N/A N/A N/A N/A
Consistency check: MAX(Max_|Err|) = 1.02E-10, Max(<|Err|>) = 1.79E-14, reference_lib: Goedecker (110)
==============================================================
==== fourwf with option 2, cplex 1, ndat 4, istwf_k 1 ====
==============================================================
Library CPU-time WALL-time nthreads ncalls Max_|Err| <|Err|>
- Goedecker (110) 0.2133 0.2140 1 (100%) 2 0.00E+00 0.00E+00
- Goedecker (111) 0.1362 0.1367 1 (100%) 2 3.33E-16 1.66E-19
- Goedecker (112) 0.1014 0.1017 1 (100%) 2 1.67E-16 2.16E-19
- Goedecker2002 (411) 0.1502 0.1507 1 (100%) 2 1.69E-16 2.30E-19
- Goedecker2002 (412) 0.1416 0.1420 1 (100%) 2 1.69E-16 2.30E-19
- FFTW3 (312) 0.0766 0.0769 1 (100%) 2 3.37E-16 3.03E-19
- DFTI (512) N/A N/A N/A N/A N/A N/A
Consistency check: MAX(Max_|Err|) = 3.37E-16, Max(<|Err|>) = 3.03E-19, reference_lib: Goedecker (110)
==============================================================
==== fourwf with option 3, cplex 0, ndat 4, istwf_k 1 ====
==============================================================
Library CPU-time WALL-time nthreads ncalls Max_|Err| <|Err|>
- Goedecker (110) 0.1007 0.1011 1 (100%) 2 0.00E+00 0.00E+00
- Goedecker (111) 0.0640 0.0642 1 (100%) 2 1.11E-16 5.28E-20
- Goedecker (112) 0.0640 0.0642 1 (100%) 2 1.11E-16 5.28E-20
- Goedecker2002 (411) 0.0709 0.0711 1 (100%) 2 1.12E-16 4.72E-20
- Goedecker2002 (412) 0.0680 0.0682 1 (100%) 2 1.12E-16 4.72E-20
- FFTW3 (312) 0.0315 0.0316 1 (100%) 2 1.67E-16 6.44E-20
- DFTI (512) N/A N/A N/A N/A N/A N/A
Consistency check: MAX(Max_|Err|) = 1.67E-16, Max(<|Err|>) = 6.44E-20, reference_lib: Goedecker (110)
==============================================================
==== fourwf with option 2, cplex 2, ndat 4, istwf_k 1 ====
==============================================================
Library CPU-time WALL-time nthreads ncalls Max_|Err| <|Err|>
- Goedecker (110) 0.2118 0.2126 1 (100%) 2 0.00E+00 0.00E+00
- Goedecker (111) 0.1378 0.1383 1 (100%) 2 2.23E-16 1.92E-19
- Goedecker (112) 0.1073 0.1078 1 (100%) 2 2.24E-16 2.74E-19
- Goedecker2002 (411) 0.1515 0.1520 1 (100%) 2 2.26E-16 2.84E-19
- Goedecker2002 (412) 0.1431 0.1436 1 (100%) 2 2.26E-16 2.84E-19
- FFTW3 (312) 0.0702 0.0705 1 (100%) 2 4.60E-16 3.64E-19
- DFTI (512) N/A N/A N/A N/A N/A N/A
Consistency check: MAX(Max_|Err|) = 4.60E-16, Max(<|Err|>) = 3.64E-19, reference_lib: Goedecker (110)
Analysis completed.