Commit Graph

8 Commits

Author SHA1 Message Date
Paolo Giannozzi 5c444d5a1d The definitions of "backward" and "forward" FFT in in-line comments were
not consistent with what FFTs actually do in the code
2021-03-27 10:29:08 +01:00
Pietro 88bd4e448a Add GPU support and 1D+2D decomposition to FFTXlib 2020-10-01 19:32:07 +00:00
giacombum 1f450feec1 new FFTXlib based on OpenMP 2020-02-13 16:03:41 +00:00
Stefano de Gironcoli 492c1c052e AGAIN on the develop branch ...
variable grid_type (internal to the FFTXlib routines, never referenced outside)
renamed to fft_kind to better reflect its meaning.
unused file task_groups.f90 removed
2018-02-08 14:23:10 +01:00
Stefano de Gironcoli 41e91c0dac new interface to fft calls
three types of calls are possible: 'Rho', 'Wave', 'tgWave'

   In order to enable an fft-type for a given grid the corresponding clock_labels must be set.
   One gives a name to desc%rho_clock_lable for 'Rho' type fft and a name to
   desc%wave_clock_lable for 'Wave' and 'tgWave' types. Whether tg is
   possible depends on the already defined value of the desc%have_task_groups variable (misspelling to be corrected soon).

   defining
      dffts%rho_clock_label='ffts', dffts%wave_clock_label='fftw',
      dfftp%rho_clock_label='fft', dfftt%rho_clock_label='fftc' and
      dfftt%wave_clock_label='fftcw'
   and changing
      'Dense'->'Rho', 'Smooth'->'Rho', 'Custom'->'Rho', 'CustomWave'->'Wave'
   the same clock names and the same overall behavior as with the old interface are obtained.
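
   The renaming and the clock labels listed above can be summarized as a small lookup. This is only an illustrative sketch in Python, not part of FFTXlib: the dictionaries copy the values given in this message, while the helper name clock_for and the use of descriptor names as plain strings are assumptions made for the example.

```python
# Old grid_type names mapped to the new fft kinds, as listed in the message.
OLD_TO_NEW_KIND = {
    "Dense": "Rho",
    "Smooth": "Rho",
    "Custom": "Rho",
    "CustomWave": "Wave",
}

# Clock labels from the message, keyed by (descriptor name, label family).
# 'Wave' and 'tgWave' both use the wave clock label of the descriptor.
CLOCK_LABELS = {
    ("dffts", "Rho"): "ffts",
    ("dffts", "Wave"): "fftw",
    ("dfftp", "Rho"): "fft",
    ("dfftt", "Rho"): "fftc",
    ("dfftt", "Wave"): "fftcw",
}


def clock_for(descriptor, grid_type):
    """Hypothetical helper: return the clock name for a descriptor and a kind.

    Accepts either an old name ('Dense', 'Smooth', ...) or a new one
    ('Rho', 'Wave', 'tgWave'). Raises ValueError when no label is set,
    mirroring the rule that a kind is enabled only if its label is defined.
    """
    kind = OLD_TO_NEW_KIND.get(grid_type, grid_type)
    family = "Wave" if kind in ("Wave", "tgWave") else "Rho"
    label = CLOCK_LABELS.get((descriptor, family))
    if label is None:
        raise ValueError(f"no {family} clock label set for {descriptor}")
    return label
```

   With these tables, clock_for("dffts", "Smooth") gives 'ffts' and clock_for("dfftt", "CustomWave") gives 'fftcw', which are the clock names listed above.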
2018-01-02 17:45:45 +01:00
giannozz bb112e77a8 __OPENMP => _OPENMP (set by all OpenMP-aware compilers)
git-svn-id: http://qeforge.qe-forge.org/svn/q-e/trunk/espresso@13951 c92efa57-630b-4861-b058-cf58834340f0
2017-10-25 07:32:10 +00:00
degironc 3e6b4f8e76 MAJOR restructuring of the FFTXlib library
In real space processors are organized in a 2D pattern.

Each processor owns data from a sub-set of Z-planes and a sub-set of Y-planes.
In reciprocal space each processor owns Z-columns that belong to a subset of
X-values. This allows the processors to be split into two sets for communication
in the YZ and XY planes.
Alternatively, if the situation allows for it, a task group parallelization is used
(with ntg=nyfft), where complete XY planes of ntg wavefunctions are collected and Fourier
transformed in G space by different task groups. When task groups can be used, this is
preferable to the Z-proc + Y-proc parallelization because a smaller number of larger
amounts of data is transferred. Hence three types of fft are implemented:
 
  !
  !! ... isgn = +-1 : parallel 3d fft for rho and for the potential
  !
  !! ... isgn = +-2 : parallel 3d fft for wavefunctions
  !
  !! ... isgn = +-3 : parallel 3d fft for wavefunctions with task group
  !
  !! ... isgn = +   : G-space to R-space, output = \sum_G f(G)exp(+iG*R)
  !! ...              fft along z using pencils        (cft_1z)
  !! ...              transpose across nodes           (fft_scatter_yz)
  !! ...              fft along y using pencils        (cft_1y)
  !! ...              transpose across nodes           (fft_scatter_xy)
  !! ...              fft along x using pencils        (cft_1x)
  !
  !! ... isgn = -   : R-space to G-space, output = \int_R f(R)exp(-iG*R)/Omega
  !! ...              fft along x using pencils        (cft_1x)
  !! ...              transpose across nodes           (fft_scatter_xy)
  !! ...              fft along y using pencils        (cft_1y)
  !! ...              transpose across nodes           (fft_scatter_yz)
  !! ...              fft along z using pencils        (cft_1z)
  !
  ! If task_group_fft_is_active the FFT acts on a number of wfcs equal to 
  ! dfft%nproc2, the number of Y-sections in which a plane is divided. 
  ! Data are reshuffled by the fft_scatter_tg routine so that each of the 
  ! dfft%nproc2 subgroups (made by dfft%nproc3 procs) deals with whole planes 
  ! of a single wavefunction.
  !
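
The ordering of the 1D transforms described above can be illustrated on a single process. The following is a minimal sketch in Python, not the FFTXlib implementation: it uses a naive O(N^2) DFT in place of cft_1z/cft_1y/cft_1x, omits the fft_scatter_yz and fft_scatter_xy transposes because no data is distributed, and assumes that the 1/Omega factor of the R-to-G transform corresponds to division by the number of grid points. All function names are hypothetical.

```python
import cmath


def dft_1d(values, sign):
    """Naive 1D DFT: out[k] = sum_m values[m] * exp(sign * 2*pi*i * k*m / n)."""
    n = len(values)
    return [
        sum(values[m] * cmath.exp(sign * 2j * cmath.pi * k * m / n) for m in range(n))
        for k in range(n)
    ]


def transform_axis(grid, axis, sign):
    """Apply dft_1d to every pencil along one axis of a grid indexed [x][y][z]."""
    nx, ny, nz = len(grid), len(grid[0]), len(grid[0][0])
    out = [[[0j] * nz for _ in range(ny)] for _ in range(nx)]
    if axis == 2:  # pencils along z
        for x in range(nx):
            for y in range(ny):
                out[x][y] = dft_1d(grid[x][y], sign)
    elif axis == 1:  # pencils along y
        for x in range(nx):
            for z in range(nz):
                pencil = dft_1d([grid[x][y][z] for y in range(ny)], sign)
                for y in range(ny):
                    out[x][y][z] = pencil[y]
    else:  # pencils along x
        for y in range(ny):
            for z in range(nz):
                pencil = dft_1d([grid[x][y][z] for x in range(nx)], sign)
                for x in range(nx):
                    out[x][y][z] = pencil[x]
    return out


def g_to_r(f_g):
    """isgn > 0 ordering: z, then y, then x, with exp(+i G.R), no scaling."""
    out = f_g
    for axis in (2, 1, 0):
        out = transform_axis(out, axis, +1)
    return out


def r_to_g(f_r):
    """isgn < 0 ordering: x, then y, then z, with exp(-i G.R), scaled by 1/npts."""
    out = f_r
    for axis in (0, 1, 2):
        out = transform_axis(out, axis, -1)
    npts = len(f_r) * len(f_r[0]) * len(f_r[0][0])
    return [[[v / npts for v in row] for row in plane] for plane in out]
```

Under these assumptions the two transforms are inverses of each other, so r_to_g(g_to_r(f)) reproduces f up to rounding error.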

fft_type module heavily modified, a number of variables renamed with more intuitive names
(at least to me), and several more variables introduced for the Y-proc parallelization.

Task_group module made void. task_group management is now reduced to the logical component
 fft_desc%have_task_groups of fft_type_descriptor type variable fft_desc.

In terms of interfaces, the 'easy' calling sequences are

SUBROUTINE invfft/fwfft( grid_type, f, dfft, howmany )

  !! where:
  !! 
  !! **grid_type = 'Dense'** :
  !!   inverse/direct Fourier transform of potentials and charge density f
  !!   on the dense grid (dfftp). On output, f is overwritten
  !!
  !! **grid_type = 'Smooth'** :
  !!   inverse/direct Fourier transform of potentials and charge density f
  !!   on the smooth grid (dffts). On output, f is overwritten
  !!
  !! **grid_type = 'Wave'** :
  !!   inverse/direct Fourier transform of wave functions f
  !!   on the smooth grid (dffts). On output, f is overwritten
  !!
  !! **grid_type = 'tgWave'** :
  !!   inverse/direct Fourier transform of wave functions f with task group
  !!   on the smooth grid (dffts). On output, f is overwritten
  !!
  !! **grid_type = 'Custom'** :
  !!   inverse/direct Fourier transform of potentials and charge density f
  !!   on a custom grid (dfft_exx). On output, f is overwritten
  !!
  !! **grid_type = 'CustomWave'** :
  !!   inverse/direct Fourier transform of wave functions f
  !!   on a custom grid (dfft_exx). On output, f is overwritten
  !! 
  !! **dfft = FFT descriptor**, IMPORTANT NOTICE: grid is specified only by dfft.
  !!   No check is performed on the correspondence between dfft and grid_type.
  !!   grid_type is now used only to distinguish cases 'Wave' / 'CustomWave' 
  !!   from all other cases
                                                                                                 

Many more files modified.




git-svn-id: http://qeforge.qe-forge.org/svn/q-e/trunk/espresso@13676 c92efa57-630b-4861-b058-cf58834340f0
2017-08-01 20:31:02 +00:00
giannozz 229692d720 Cleanup of FFTXlib:
1) routines fwfft and invfft moved out of file fft_interfaces.f90 into new
   file fft_fwinv.f90. Prevents massive recompilation if something changes
   in the FFT routines.
2) machine-dependent fft_scalar.*.f90 are now modules with different names,
   conditionally included into fft_scalar with a USE, no longer an #include.
   Avoids trouble with dependencies, allows simplification of makedeps.sh.
All changes should be harmless, but I have tested only FFTW, FFTW3, DFTI.
Please let me know if there is any problem.



git-svn-id: http://qeforge.qe-forge.org/svn/q-e/trunk/espresso@13185 c92efa57-630b-4861-b058-cf58834340f0
2016-11-27 21:43:15 +00:00