quantum-espresso/FFTXlib
Ye Luo 2855812998 Use fftw3 f03 APIs 2021-06-02 03:19:21 -05:00
examples execution order of the ffts in the scalar case (cftt3ds) changed from 2016-07-07 15:44:16 +00:00
tests Fixed bug in many_cft3s_gpu for 'Rho' case. Activated batched Rho fft together with tests in test_fwinv_gpu. 2018-11-05 15:23:18 +01:00
CMakeLists.txt Add unit test runner. 2021-05-23 09:58:06 -05:00
FFTXlib.md Added FORD documentation for test.f90 2016-08-05 09:14:08 +00:00
Makefile More ARMLib FFT removal 2021-02-13 10:07:35 +01:00
README.md More ARMLib FFT removal 2021-02-13 10:07:35 +01:00
fft_buffers.f90 Add GPU support and 1D+2D decomposition to FFTXlib 2020-10-01 19:32:07 +00:00
fft_error.f90 Typo correction. 2021-01-24 10:24:27 -06:00
fft_fwinv.f90 The definitions of "backward" and "forward" FFT in on-line comments was 2021-03-27 10:29:08 +01:00
fft_ggen.f90 Add GPU support and 1D+2D decomposition to FFTXlib 2020-10-01 19:32:07 +00:00
fft_helper_subroutines.f90 Merge branch 'develop' into gpu-develop 2020-11-30 10:50:14 +01:00
fft_interfaces.f90 Add GPU support and 1D+2D decomposition to FFTXlib 2020-10-01 19:32:07 +00:00
fft_interpolate.f90 define a generic interface for fft_interpolate and move the correspondig routine to FFTXlib 2018-01-08 23:02:08 +01:00
fft_parallel.f90 Add GPU support and 1D+2D decomposition to FFTXlib 2020-10-01 19:32:07 +00:00
fft_parallel_2d.f90 Aligned gpu-develop branch with develop from QEF/q-e. 2021-01-19 22:16:25 +00:00
fft_param.f90 Fix error handling in FFTXlib 2021-01-23 15:56:47 -06:00
fft_scalar.DFTI.f90 Merge branch 'pdos' into 'develop' 2021-03-27 10:04:56 +00:00
fft_scalar.ESSL.f90 The definitions of "backward" and "forward" FFT in on-line comments was 2021-03-27 10:29:08 +01:00
fft_scalar.FFTW.f90 The definitions of "backward" and "forward" FFT in on-line comments was 2021-03-27 10:29:08 +01:00
fft_scalar.FFTW3.f90 Use fftw3 f03 APIs 2021-06-02 03:19:21 -05:00
fft_scalar.SX6.f90 The definitions of "backward" and "forward" FFT in on-line comments was 2021-03-27 10:29:08 +01:00
fft_scalar.cuFFT.f90 Bury unused cfft3ds_gpu implementaton in history. 2021-05-20 14:54:20 -05:00
fft_scalar.f90 Obsolete FFTs from ARMlib removed 2021-02-13 09:57:45 +01:00
fft_scatter.f90 FFTXlib cleanup (?) 2020-09-23 13:18:15 +02:00
fft_scatter_2d.f90 Add GPU support and 1D+2D decomposition to FFTXlib 2020-10-01 19:32:07 +00:00
fft_scatter_2d_gpu.f90 Aligned gpu-develop branch with develop from QEF/q-e. 2021-01-19 22:16:25 +00:00
fft_scatter_gpu.f90 Aligned gpu-develop branch with develop from QEF/q-e. 2021-01-19 22:16:25 +00:00
fft_smallbox.f90 MAJOR restructuring of the FFTXlib library 2017-08-01 20:31:02 +00:00
fft_smallbox_type.f90 MAJOR restructuring of the FFTXlib library 2017-08-01 20:31:02 +00:00
fft_stick.c Fix FFTXLIB NVHPC warning. 2021-05-02 14:43:41 -05:00
fft_support.f90 Minor changes to FFTXlib. The only change (marginally) affecting 2019-04-25 21:06:44 +02:00
fft_test.f90 Avoid timers from qe_utilx. 2021-05-08 16:05:56 -05:00
fft_types.f90 making destroy of desc%a2a_comp safer 2021-01-29 15:35:30 +01:00
fftw.c - internal new single precision fft driver 2019-05-25 08:39:53 +02:00
fftw.h - internal new single precision fft driver 2019-05-25 08:39:53 +02:00
fftw_dp.c Fix FFTXLIB NVHPC warning. 2021-05-02 14:43:41 -05:00
fftw_dp.h - internal new single precision fft driver 2019-05-25 08:39:53 +02:00
fftw_interfaces.f90 Missing interfaces for FFTW + misspells, noticed by Anton 2017-08-08 14:17:21 +00:00
fftw_sp.c Fix FFTXLIB NVHPC warning. 2021-05-02 14:43:41 -05:00
fftw_sp.h - internal new single precision fft driver 2019-05-25 08:39:53 +02:00
gen_test_params.py Convert to Python3 and minor improvements 2020-09-23 23:50:24 +09:00
konst.h - FFT Modules replaced by FFTXlib 2015-11-21 10:37:48 +00:00
make.depend Obsolete FFTs from ARMlib removed 2021-02-13 09:57:45 +01:00
scatter_mod.f90 FFTXlib cleanup (?) 2020-09-23 13:18:15 +02:00
stick_base.f90 a forgotten deallocation created a serious memory leak when fft is initialized several times. 2017-08-24 15:29:52 +00:00
tg_gather.f90 Add GPU support and 1D+2D decomposition to FFTXlib 2020-10-01 19:32:07 +00:00

README.md

FFTXlib

Implements real space grid parallelization of FFT and task groups.
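
As a generic illustration of what distributing a real-space grid over processes means, the sketch below splits the planes of a grid into contiguous blocks, one per process. It is not taken from FFTXlib: the function name `split_planes` and the block-distribution rule are assumptions chosen for clarity, and the library's own descriptor types define the actual layout.

```python
def split_planes(nplanes, nprocs):
    """Split nplanes grid planes into nprocs contiguous blocks.

    Hypothetical helper for illustration only; not part of FFTXlib.
    Returns a list of (first_plane, plane_count) tuples, one per process.
    The first (nplanes % nprocs) processes receive one extra plane.
    """
    base, extra = divmod(nplanes, nprocs)
    counts = [base + 1 if rank < extra else base for rank in range(nprocs)]
    offsets = [sum(counts[:rank]) for rank in range(nprocs)]
    return list(zip(offsets, counts))


if __name__ == "__main__":
    # 10 planes over 4 processes
    for rank, (first, count) in enumerate(split_planes(10, 4)):
        print(f"rank {rank}: planes {first}..{first + count - 1}")
```

Each process then works only on its own block of planes, and data must be exchanged between processes whenever a transform runs along the distributed direction.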

Testing and Benchmarking

This library also provides a testing and timing program to assess the performance of your FFT and to estimate the scalability and the optimal parameters for your simulation.

To compile the test program, once you have properly configured QE within a parallel environment, go into the FFTXlib directory and type:

make TEST

Then you can run your FFT tests using a command like:

mpirun -np 4 ./fft_test.x -ecutwfc 80 -alat 20  -nbnd 128 -ntg 4

Command line arguments:

-ecutwfc  Plane-wave energy cutoff
-alat     Lattice parameter (for hard coded lattice structure)
-nbnd     Number of bands (fft cycles)
-ntg      Number of task groups
-av1  x y z    First lattice vector, in atomic units. N.B.: when using -av1, -alat is ignored!
-av2  x y z    Second lattice vector, in atomic units. N.B.: when using -av2, -alat is ignored!
-av3  x y z    Third lattice vector, in atomic units. N.B.: when using -av3, -alat is ignored!
-kmax kx ky kz    Reciprocal lattice vector inside the BZ with maximum norm. Used to calculate max(|G+K|). (2pi/a)^2 units.
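
When running many tests, it can help to assemble the command line programmatically. The sketch below builds the argument list for one run from the options listed above. The helper name `build_fft_test_command` is hypothetical, and only the `-ecutwfc`, `-alat`, `-nbnd` and `-ntg` options are covered.

```python
def build_fft_test_command(nprocs, ecutwfc, alat, nbnd, ntg, exe="./fft_test.x"):
    """Return the argument list for one fft_test.x run.

    Hypothetical helper for illustration; the option names come from
    the list of command line arguments above.
    """
    return [
        "mpirun", "-np", str(nprocs), exe,
        "-ecutwfc", str(ecutwfc),
        "-alat", str(alat),
        "-nbnd", str(nbnd),
        "-ntg", str(ntg),
    ]


if __name__ == "__main__":
    cmd = build_fft_test_command(nprocs=4, ecutwfc=80, alat=20, nbnd=128, ntg=4)
    # Print the command; pass the list to subprocess.run(cmd) to execute it.
    print(" ".join(cmd))
```

Keeping the command as a list rather than a single string lets it be passed directly to `subprocess.run` without shell quoting.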

A Python script that extracts these parameters from a pw.x output file is also available. Example usage:

$ python gen_test_params.py a_pw_output
To analize performances run with:
mpirun -np X ./fft_test.x -ntg Y -ecutwfc 36.7500 -ecutrho 147.0000 -av1 36.6048 0.0 0.0 -av2 -18.3024 31.70067192 0.0 -av3 0.0 0.0 18.3024 -nbnd 400 -gamma .true.

Replace X and Y with appropriate values for your simulation.
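
One way to choose X and Y is to scan several combinations and compare the timings. The sketch below fills the placeholders in a command of the form printed by the script. It assumes that the number of task groups should divide the number of MPI processes; that constraint, and the helper names `task_group_candidates` and `fill_template`, are assumptions made for this example.

```python
def task_group_candidates(nprocs):
    """Return every ntg value that divides nprocs evenly (assumed constraint)."""
    return [ntg for ntg in range(1, nprocs + 1) if nprocs % ntg == 0]


def fill_template(template, nprocs, ntg):
    """Replace the X and Y placeholders in a printed fft_test.x command."""
    return (template
            .replace("-np X", f"-np {nprocs}")
            .replace("-ntg Y", f"-ntg {ntg}"))


if __name__ == "__main__":
    # Shortened version of the command printed by gen_test_params.py
    template = "mpirun -np X ./fft_test.x -ntg Y -ecutwfc 36.7500 -nbnd 400"
    for ntg in task_group_candidates(8):
        print(fill_template(template, 8, ntg))
```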

Files

Compile time parameters:

fft_param.f90

Descriptor types:

stick_base.f90
fft_types.f90
fft_smallbox_type.f90

Parallel execution routines:

fft_interfaces.f90 fft_fwinv.f90
  fft_parallel.f90
    scatter_mod.f90
      tg_gather.f90
  fft_interpolate.f90
  fft_smallbox.f90

Low level library wrappers:

fft_scalar.f90
fft_scalar.DFTI.f90
fft_scalar.ESSL.f90
fft_scalar.FFTW.f90 fftw_interfaces.f90
  fft_stick.c fftw.c fftw_dp.c fftw_sp.c fftw_dp.h fftw.h fftw_sp.h konst.h
fft_scalar.FFTW3.f90
fft_scalar.SX6.f90

Misc. helper routines:

fft_ggen.f90
fft_error.f90
fft_helper_subroutines.f90
fft_support.f90

Tests:

test0.f90
test.f90