mirror of https://github.com/abinit/abinit.git
165 lines
8.8 KiB
Plaintext
165 lines
8.8 KiB
Plaintext
|
|
.Version 9.3.1 of FFTPROF
|
|
.(MPI version, prepared for a x86_64_linux_gnu9.3 computer)
|
|
|
|
.Copyright (C) 1998-2025 ABINIT group .
|
|
FFTPROF comes with ABSOLUTELY NO WARRANTY.
|
|
It is free software, and you are welcome to redistribute it
|
|
under certain conditions (GNU General Public License,
|
|
see ~abinit/COPYING or http://www.gnu.org/copyleft/gpl.txt).
|
|
|
|
ABINIT is a project of the Universite Catholique de Louvain,
|
|
Corning Inc. and other collaborators, see ~abinit/doc/developers/contributors.txt .
|
|
Please read https://docs.abinit.org/theory/acknowledgments for suggested
|
|
acknowledgments of the ABINIT effort.
|
|
For more information, see https://www.abinit.org .
|
|
|
|
.Starting date : Tue 10 Nov 2020.
|
|
- ( at 22h24 )
|
|
|
|
Tool for profiling and testing the FFT libraries used in ABINIT.
|
|
Allowed options are:
|
|
fourdp --> Test FFT transforms of density and potentials on the full box.
|
|
fourwf --> Test FFT transforms of wavefunctions using the zero-pad algorithm.
|
|
gw_fft --> Test the FFT transforms used in the GW code.
|
|
all --> Test all FFT routines.
|
|
|
|
Real(R)+Recip(G) space primitive vectors, cartesian coordinates (Bohr,Bohr^-1):
|
|
R(1)= 20.0000000 0.0000000 0.0000000 G(1)= 0.0500000 0.0000000 0.0000000
|
|
R(2)= 0.0000000 20.0000000 0.0000000 G(2)= 0.0000000 0.0500000 0.0000000
|
|
R(3)= 0.0000000 0.0000000 20.0000000 G(3)= 0.0000000 0.0000000 0.0500000
|
|
Unit cell volume ucvol= 8.0000000E+03 bohr^3
|
|
Unit cell volume ucvol= 8.0000000E+03 bohr^3
|
|
Angles (23,13,12)= 9.00000000E+01 9.00000000E+01 9.00000000E+01 degrees
|
|
Angles (23,13,12)= 9.00000000E+01 9.00000000E+01 9.00000000E+01 degrees
|
|
|
|
==== FFT setup for fftalg 110 ====
|
|
FFT mesh divisions ........................ 100 100 100
|
|
Augmented FFT divisions ................... 101 101 100
|
|
FFT algorithm ............................. 110
|
|
FFT cache size ............................ 16
|
|
|
|
==== FFT setup for fftalg 111 ====
|
|
FFT mesh divisions ........................ 100 100 100
|
|
Augmented FFT divisions ................... 101 101 100
|
|
FFT algorithm ............................. 111
|
|
FFT cache size ............................ 16
|
|
|
|
==== FFT setup for fftalg 112 ====
|
|
FFT mesh divisions ........................ 100 100 100
|
|
Augmented FFT divisions ................... 101 101 100
|
|
FFT algorithm ............................. 112
|
|
FFT cache size ............................ 16
|
|
|
|
==== FFT setup for fftalg 410 ====
|
|
FFT mesh divisions ........................ 100 100 100
|
|
Augmented FFT divisions ................... 101 101 100
|
|
FFT algorithm ............................. 410
|
|
FFT cache size ............................ 16
|
|
|
|
==== FFT setup for fftalg 411 ====
|
|
FFT mesh divisions ........................ 100 100 100
|
|
Augmented FFT divisions ................... 101 101 100
|
|
FFT algorithm ............................. 411
|
|
FFT cache size ............................ 16
|
|
|
|
==== FFT setup for fftalg 412 ====
|
|
FFT mesh divisions ........................ 100 100 100
|
|
Augmented FFT divisions ................... 101 101 100
|
|
FFT algorithm ............................. 412
|
|
FFT cache size ............................ 16
|
|
|
|
==== FFT setup for fftalg 312 ====
|
|
FFT mesh divisions ........................ 100 100 100
|
|
Augmented FFT divisions ................... 101 101 100
|
|
FFT algorithm ............................. 312
|
|
FFT cache size ............................ 16
|
|
|
|
==== FFT setup for fftalg 512 ====
|
|
FFT mesh divisions ........................ 100 100 100
|
|
Augmented FFT divisions ................... 101 101 100
|
|
FFT algorithm ............................. 512
|
|
FFT cache size ............................ 16
|
|
|
|
==============================================================
|
|
==== fourwf with option 0, cplex 0, ndat 1, istwf_k 1 ====
|
|
==============================================================
|
|
Library CPU-time WALL-time nthreads ncalls Max_|Err| <|Err|>
|
|
- Goedecker (110) 0.0286 0.0286 1 (100%) 5 0.00E+00 0.00E+00
|
|
- Goedecker (111) 0.0189 0.0190 1 (100%) 5 0.00E+00 0.00E+00
|
|
- Goedecker (112) 0.0162 0.0163 1 (100%) 5 5.78E-14 1.76E-15
|
|
- Goedecker2002 (410) 0.0501 0.0503 1 (100%) 5 6.02E-14 1.79E-15
|
|
- Goedecker2002 (411) 0.0197 0.0198 1 (100%) 5 6.02E-14 1.79E-15
|
|
- Goedecker2002 (412) 0.0194 0.0194 1 (100%) 5 6.02E-14 1.79E-15
|
|
- FFTW3 (312) 0.0115 0.0115 1 (100%) 5 9.45E-14 2.09E-15
|
|
- DFTI (512) N/A N/A N/A N/A N/A N/A
|
|
|
|
Consistency check: MAX(Max_|Err|) = 9.45E-14, Max(<|Err|>) = 2.09E-15, reference_lib: Goedecker (110)
|
|
|
|
|
|
==============================================================
|
|
==== fourwf with option 1, cplex 1, ndat 1, istwf_k 1 ====
|
|
==============================================================
|
|
Library CPU-time WALL-time nthreads ncalls Max_|Err| <|Err|>
|
|
- Goedecker (110) 0.0271 0.0272 1 (100%) 5 0.00E+00 0.00E+00
|
|
- Goedecker (111) 0.0179 0.0180 1 (100%) 5 0.00E+00 0.00E+00
|
|
- Goedecker (112) 0.0158 0.0159 1 (100%) 5 1.82E-11 1.28E-14
|
|
- Goedecker2002 (410) 0.0504 0.0506 1 (100%) 5 2.18E-11 1.37E-14
|
|
- Goedecker2002 (411) 0.0212 0.0213 1 (100%) 5 2.18E-11 1.37E-14
|
|
- Goedecker2002 (412) 0.0196 0.0196 1 (100%) 5 2.18E-11 1.37E-14
|
|
- FFTW3 (312) 0.0122 0.0122 1 (100%) 5 2.55E-11 1.79E-14
|
|
- DFTI (512) N/A N/A N/A N/A N/A N/A
|
|
|
|
Consistency check: MAX(Max_|Err|) = 2.55E-11, Max(<|Err|>) = 1.79E-14, reference_lib: Goedecker (110)
|
|
|
|
|
|
==============================================================
|
|
==== fourwf with option 2, cplex 1, ndat 1, istwf_k 1 ====
|
|
==============================================================
|
|
Library CPU-time WALL-time nthreads ncalls Max_|Err| <|Err|>
|
|
- Goedecker (110) 0.0576 0.0578 1 (100%) 5 0.00E+00 0.00E+00
|
|
- Goedecker (111) 0.0358 0.0359 1 (100%) 5 3.33E-16 1.66E-19
|
|
- Goedecker (112) 0.0302 0.0303 1 (100%) 5 1.67E-16 2.16E-19
|
|
- Goedecker2002 (410) 0.0882 0.0884 1 (100%) 5 1.69E-16 2.30E-19
|
|
- Goedecker2002 (411) 0.0390 0.0391 1 (100%) 5 1.69E-16 2.30E-19
|
|
- Goedecker2002 (412) 0.0354 0.0355 1 (100%) 5 1.69E-16 2.30E-19
|
|
- FFTW3 (312) 0.0188 0.0188 1 (100%) 5 3.37E-16 3.03E-19
|
|
- DFTI (512) N/A N/A N/A N/A N/A N/A
|
|
|
|
Consistency check: MAX(Max_|Err|) = 3.37E-16, Max(<|Err|>) = 3.03E-19, reference_lib: Goedecker (110)
|
|
|
|
|
|
==============================================================
|
|
==== fourwf with option 3, cplex 0, ndat 1, istwf_k 1 ====
|
|
==============================================================
|
|
Library CPU-time WALL-time nthreads ncalls Max_|Err| <|Err|>
|
|
- Goedecker (110) 0.0256 0.0257 1 (100%) 5 0.00E+00 0.00E+00
|
|
- Goedecker (111) 0.0149 0.0149 1 (100%) 5 1.11E-16 5.28E-20
|
|
- Goedecker (112) 0.0146 0.0146 1 (100%) 5 1.11E-16 5.28E-20
|
|
- Goedecker2002 (410) 0.0381 0.0382 1 (100%) 5 1.12E-16 4.72E-20
|
|
- Goedecker2002 (411) 0.0184 0.0184 1 (100%) 5 1.12E-16 4.72E-20
|
|
- Goedecker2002 (412) 0.0179 0.0180 1 (100%) 5 1.12E-16 4.72E-20
|
|
- FFTW3 (312) 0.0082 0.0083 1 (100%) 5 1.67E-16 6.44E-20
|
|
- DFTI (512) N/A N/A N/A N/A N/A N/A
|
|
|
|
Consistency check: MAX(Max_|Err|) = 1.67E-16, Max(<|Err|>) = 6.44E-20, reference_lib: Goedecker (110)
|
|
|
|
|
|
==============================================================
|
|
==== fourwf with option 2, cplex 2, ndat 1, istwf_k 1 ====
|
|
==============================================================
|
|
Library CPU-time WALL-time nthreads ncalls Max_|Err| <|Err|>
|
|
- Goedecker (110) 0.0494 0.0495 1 (100%) 5 0.00E+00 0.00E+00
|
|
- Goedecker (111) 0.0318 0.0319 1 (100%) 5 2.23E-16 1.92E-19
|
|
- Goedecker (112) 0.0281 0.0281 1 (100%) 5 2.24E-16 2.74E-19
|
|
- Goedecker2002 (410) 0.0861 0.0864 1 (100%) 5 2.26E-16 2.84E-19
|
|
- Goedecker2002 (411) 0.0391 0.0392 1 (100%) 5 2.26E-16 2.84E-19
|
|
- Goedecker2002 (412) 0.0368 0.0369 1 (100%) 5 2.26E-16 2.84E-19
|
|
- FFTW3 (312) 0.0200 0.0201 1 (100%) 5 4.60E-16 3.64E-19
|
|
- DFTI (512) N/A N/A N/A N/A N/A N/A
|
|
|
|
Consistency check: MAX(Max_|Err|) = 4.60E-16, Max(<|Err|>) = 3.64E-19, reference_lib: Goedecker (110)
|
|
|
|
|
|
Analysis completed.
|