In real space processors are organized in a 2D pattern.
Each processor owns data from a subset of Z-planes and a subset of Y-planes.
In reciprocal space each processor owns Z-columns that belong to a subset of
X-values. This allows the processors to be split into two sets for communication
in the YZ and XY planes.
Alternatively, if the situation allows for it, a task group parallelization is used
(with ntg=nyfft) where complete XY planes of ntg wavefunctions are collected and Fourier
transformed in G space by different task groups. When task groups can be used, this is
preferable to the Z-proc + Y-proc parallelization because a smaller number of larger
amounts of data are transferred. Hence three types of FFT are implemented:
!
!! ... isgn = +-1 : parallel 3d fft for rho and for the potential
!
!! ... isgn = +-2 : parallel 3d fft for wavefunctions
!
!! ... isgn = +-3 : parallel 3d fft for wavefunctions with task group
!
!! ... isgn = + : G-space to R-space, output = \sum_G f(G)exp(+iG*R)
!! ... fft along z using pencils (cft_1z)
!! ... transpose across nodes (fft_scatter_yz)
!! ... fft along y using pencils (cft_1y)
!! ... transpose across nodes (fft_scatter_xy)
!! ... fft along x using pencils (cft_1x)
!
!! ... isgn = - : R-space to G-space, output = \int_R f(R)exp(-iG*R)/Omega
!! ... fft along x using pencils (cft_1x)
!! ... transpose across nodes (fft_scatter_xy)
!! ... fft along y using pencils (cft_1y)
!! ... transpose across nodes (fft_scatter_yz)
!! ... fft along z using pencils (cft_1z)
!
! If task_group_fft_is_active the FFT acts on a number of wfcs equal to
! dfft%nproc2, the number of Y-sections in which a plane is divided.
! Data are reshuffled by the fft_scatter_tg routine so that each of the
! dfft%nproc2 subgroups (made by dfft%nproc3 procs) deals with whole planes
! of a single wavefunction.
!
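The z, y, x ordering described above can be sketched serially. The following is a minimal single-process illustration in Python, not the parallel Fortran implementation: the names `dft_1d`, `transform_axis`, `invfft_3d` and `fwfft_3d` are hypothetical, the transposes across nodes (fft_scatter_yz, fft_scatter_xy) are omitted because all data sits in one process, and the 1/N factor is assumed to be the discrete counterpart of 1/Omega.

```python
import cmath

def dft_1d(values, sign):
    """Unnormalized 1D DFT: out[r] = sum_g values[g] * exp(sign*2*pi*i*g*r/n)."""
    n = len(values)
    return [
        sum(values[g] * cmath.exp(sign * 2j * cmath.pi * g * r / n) for g in range(n))
        for r in range(n)
    ]

def transform_axis(f, axis, sign):
    """Apply dft_1d to every pencil of f[x][y][z] along axis (0=x, 1=y, 2=z)."""
    dims = (len(f), len(f[0]), len(f[0][0]))
    out = [[[0j] * dims[2] for _ in range(dims[1])] for _ in range(dims[0])]
    a, b = [d for d in range(3) if d != axis]
    for i in range(dims[a]):
        for j in range(dims[b]):
            idx = [0, 0, 0]
            idx[a], idx[b] = i, j
            pencil = []
            for k in range(dims[axis]):
                idx[axis] = k
                pencil.append(f[idx[0]][idx[1]][idx[2]])
            for k, value in enumerate(dft_1d(pencil, sign)):
                idx[axis] = k
                out[idx[0]][idx[1]][idx[2]] = value
    return out

def invfft_3d(f_g):
    """isgn = +: G-space to R-space, pencils along z, then y, then x."""
    f = f_g
    for axis in (2, 1, 0):
        f = transform_axis(f, axis, +1)
    return f

def fwfft_3d(f_r):
    """isgn = -: R-space to G-space, pencils along x, then y, then z, scaled by 1/N."""
    f = f_r
    for axis in (0, 1, 2):
        f = transform_axis(f, axis, -1)
    n = len(f) * len(f[0]) * len(f[0][0])
    return [[[v / n for v in row] for row in plane] for plane in f]

def direct_sum(f_g, x, y, z):
    """Reference value: sum_G f(G) exp(+iG*R) evaluated at one grid point."""
    nx, ny, nz = len(f_g), len(f_g[0]), len(f_g[0][0])
    return sum(
        f_g[gx][gy][gz]
        * cmath.exp(2j * cmath.pi * (gx * x / nx + gy * y / ny + gz * z / nz))
        for gx in range(nx) for gy in range(ny) for gz in range(nz)
    )

# Toy 2 x 3 x 4 grid with arbitrary complex coefficients.
f_g = [[[complex(x + 2 * y - z, x * y + 0.5 * z) for z in range(4)]
        for y in range(3)] for x in range(2)]
f_r = invfft_3d(f_g)
back = fwfft_3d(f_r)
max_err = max(abs(back[x][y][z] - f_g[x][y][z])
              for x in range(2) for y in range(3) for z in range(4))
```

Because the 3D transform is separable, the three 1D passes reproduce the direct triple sum, and a forward transform after an inverse one recovers the input up to rounding.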
fft_type module heavily modified: a number of variables renamed with more intuitive names
(at least to me), and a number of new variables introduced for the Y-proc parallelization.
Task_group module made void. Task group management is now reduced to the logical component
fft_desc%have_task_groups of the fft_type_descriptor type variable fft_desc.
In terms of interfaces, the 'easy' calling sequences are
SUBROUTINE invfft/fwfft( grid_type, f, dfft, howmany )
!! where:
!!
!! **grid_type = 'Dense'** :
!! inverse/direct fourier transform of potentials and charge density f
!! on the dense grid (dfftp). On output, f is overwritten
!!
!! **grid_type = 'Smooth'** :
!! inverse/direct fourier transform of potentials and charge density f
!! on the smooth grid (dffts). On output, f is overwritten
!!
!! **grid_type = 'Wave'** :
!! inverse/direct fourier transform of wave functions f
!! on the smooth grid (dffts). On output, f is overwritten
!!
!! **grid_type = 'tgWave'** :
!! inverse/direct fourier transform of wave functions f with task group
!! on the smooth grid (dffts). On output, f is overwritten
!!
!! **grid_type = 'Custom'** :
!! inverse/direct fourier transform of potentials and charge density f
!! on a custom grid (dfft_exx). On output, f is overwritten
!!
!! **grid_type = 'CustomWave'** :
!! inverse/direct fourier transform of wave functions f
!! on a custom grid (dfft_exx). On output, f is overwritten
!!
!! **dfft = FFT descriptor**, IMPORTANT NOTICE: grid is specified only by dfft.
!! No check is performed on the correspondence between dfft and grid_type.
!! grid_type is now used only to distinguish cases 'Wave' / 'CustomWave'
!! from all other cases
Many more files modified.
git-svn-id: http://qeforge.qe-forge.org/svn/q-e/trunk/espresso@13676 c92efa57-630b-4861-b058-cf58834340f0
basic operations: error handling, timing clocks, interfaces to basic mpi
calls, find free units...
These routines are moved from Modules and dependencies to other modules
are removed.
MANY files are updated to comply with the move.
git-svn-id: http://qeforge.qe-forge.org/svn/q-e/trunk/espresso@13629 c92efa57-630b-4861-b058-cf58834340f0
US variable qq renamed qq_nt and a new variable qq_na added
because in real space the integral may depend (slightly) on
the atomic position and an atomic value is needed to compute
exactly normalizable wfc.
Whenever realspace tricks are not used qq_nt is used.
git-svn-id: http://qeforge.qe-forge.org/svn/q-e/trunk/espresso@13604 c92efa57-630b-4861-b058-cf58834340f0
precision complex, the former is a single precision complex. Not a big deal,
since we use only complex 0 or 1 or i for which there is no loss of precision.
Note however that CMPLX(a,0.d0) with "a" double precision real, or CMPLX(a,b)
are single-precision complex, and this can introduce loss of precision.
git-svn-id: http://qeforge.qe-forge.org/svn/q-e/trunk/espresso@13314 c92efa57-630b-4861-b058-cf58834340f0
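The CMPLX note above can be illustrated numerically. This is a small Python sketch, not Fortran: the helpers `to_single` and `cmplx_default_kind` are hypothetical and only emulate what happens when CMPLX is called without a KIND argument, by rounding each part through IEEE single precision.

```python
import struct

def to_single(x):
    """Round a double to the nearest IEEE single-precision value."""
    return struct.unpack("f", struct.pack("f", x))[0]

def cmplx_default_kind(a, b=0.0):
    """Emulate CMPLX(a, b) without KIND: both parts end up in single precision."""
    return complex(to_single(a), to_single(b))

# 0, 1 and i are exactly representable, so nothing is lost.
exact = cmplx_default_kind(1.0, 0.0)
imag_unit = cmplx_default_kind(0.0, 1.0)

# A generic double such as 0.1 is not, so digits are lost.
lossy = cmplx_default_kind(0.1, 0.0)
error = abs(lossy.real - 0.1)
```

In Fortran the loss is avoided by passing the kind explicitly, as in CMPLX(a, b, KIND=...) with a double precision kind.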
- old routine computing dos, allocation and deallocation of "tetra" moved into
module ktetra; variables tetra and ntetra are used only inside the module
- added module variable nntetra containing number of neighboring points used
(20 for optimized tetrahedra, 4 otherwise)
git-svn-id: http://qeforge.qe-forge.org/svn/q-e/trunk/espresso@13228 c92efa57-630b-4861-b058-cf58834340f0
Variable "ltetra" moved to common "klist" together with all other variables
setting occupations. All make.depend updated. Should be harmless.
git-svn-id: http://qeforge.qe-forge.org/svn/q-e/trunk/espresso@13204 c92efa57-630b-4861-b058-cf58834340f0
to compute many FFTs at the same time, particularly useful for EXX
but could be useful for many linear response codes as well
(for the time being implemented only for DFTI and internal FFTW,
should be trivial to extend to other drivers)
- more clean-ups
git-svn-id: http://qeforge.qe-forge.org/svn/q-e/trunk/espresso@12815 c92efa57-630b-4861-b058-cf58834340f0
- FFT type no longer holds any information about task groups:
no need to temporarily change the value of variables...
- When task groups are not needed, simply do not use the "dtgs" data type
- FFT interfaces called with the FFT datatype ONLY do not perform
task group tricks any longer; this should simplify things a bit....
git-svn-id: http://qeforge.qe-forge.org/svn/q-e/trunk/espresso@12628 c92efa57-630b-4861-b058-cf58834340f0
l-dependent array in all cases.
This is already the case when upf%tpawp or upf%q_with_l are .true. .
For vanderbilt US pseudos, where nqf and rinner are non zero, we do here what otherwise
would be done multiple times in many parts of the code (such as in init_us_1, addusforce_r,
bp_calc_btq, compute_qdipol) whenever the q_l(r) were to be constructed.
For simple rrkj3 pseudos we duplicate the information contained in q(r) for all q_l(r).
This requires a little extra memory but unifies the treatment of q_l(r) and allows further
tweaking with the augmentation charge.
Variable upf%q_with_l set .true. at the end of the operation. It would be better to leave the
variable untouched at its input value and modify the routines that compute q_l(r) to just use
the now always present upf%qfuncl array but this is the first step before some cleanup.
setqf.f90 moved from PW/src to Modules, Makefiles and dependencies updated
git-svn-id: http://qeforge.qe-forge.org/svn/q-e/trunk/espresso@12591 c92efa57-630b-4861-b058-cf58834340f0
More specifically:
1) Remove global variable npw (from wvfct) and use ngk(ik) (for optical TDDFPT codes)
or ngk(ikk) with ikk=ikks(ik) (for turboEELS). In some routines, ngk is assigned to
the local variable npw, i.e. npw=ngk(ik), and in other routines ngk is used directly.
2) Remove global indices igk(1:3) (from wvfct) and use igk_k(1:3,ik) (for optical TDDFPT codes)
or igk_k(1:3,ikk) with ikk=ikks(ik) (for turboEELS).
3) Remove global variable npwq (from qpoint) and use the local variable with the same name,
which is defined as npwq=ngk(ikq) with ikq=ikqs(ik) (i.e. index of the point k+q).
4) Remove global index variable igkq(1:3) (from qpoint) and use the global index variable
igk_k(1:3,ikq) with ikq=ikqs(ik).
git-svn-id: http://qeforge.qe-forge.org/svn/q-e/trunk/espresso@12487 c92efa57-630b-4861-b058-cf58834340f0
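The replacement of the global npw/npwq by per-k-point lookups listed above can be sketched as follows. This is a toy Python illustration with made-up data: the array contents are hypothetical, indices are 0-based rather than Fortran's 1-based, and the function `local_pw_counts` does not exist in the code; only the names ngk, igk_k, ikks and ikqs mirror the variables mentioned in the commit message.

```python
# Hypothetical toy data: four k-points, stored as k, k+q, k, k+q.
ngk = [5, 7, 6, 8]                      # number of plane waves for each k-point
igk_k = [list(range(n)) for n in ngk]   # plane-wave index list for each k-point
ikks = [0, 2]                           # position of point k for each ik
ikqs = [1, 3]                           # position of point k+q for each ik

def local_pw_counts(ik):
    """Return local npw and npwq for index ik instead of reading globals."""
    ikk = ikks[ik]
    ikq = ikqs[ik]
    npw = ngk[ikk]    # replaces the global npw from wvfct
    npwq = ngk[ikq]   # replaces the global npwq from qpoint
    return npw, npwq

def local_pw_indices(ik):
    """Return the index lists that replace the global igk and igkq."""
    return igk_k[ikks[ik]], igk_k[ikqs[ik]]

npw, npwq = local_pw_counts(0)
igk_local, igkq_local = local_pw_indices(1)
```

Keeping the counts local means a routine can no longer pick up a stale value left behind by a previous k-point.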
BEWARE: I think I have modified all codes that needed it, but please
1) verify that both allocation and deallocation are made in the proper place
2) update other codes not under svn that make use of this variable
git-svn-id: http://qeforge.qe-forge.org/svn/q-e/trunk/espresso@12451 c92efa57-630b-4861-b058-cf58834340f0
good idea to call "h_psi" a routine that does something related to but
different from H\psi. Corrected a few grossly wrong comments.
git-svn-id: http://qeforge.qe-forge.org/svn/q-e/trunk/espresso@12440 c92efa57-630b-4861-b058-cf58834340f0
1) Moved some TDDFPT-specific cases from the general routine LR_Modules/ch_psi_all.f90 to the TDDFPT routines;
2) Deleted the variable "tddfpt", because it is no longer needed anywhere (in older versions of the code this variable was used to tell the PHonon routines about TDDFPT-specific operations);
3) Some other minor changes in TDDFPT.
git-svn-id: http://qeforge.qe-forge.org/svn/q-e/trunk/espresso@12432 c92efa57-630b-4861-b058-cf58834340f0
variables deleted; almost all occurrences of "npw" made local (PW and PP only)
Variable "current_k" must be set before calling h_psi (as before, although it
was used only in some cases). All changes should be safe, but testing of PP
and PH is very limited.
git-svn-id: http://qeforge.qe-forge.org/svn/q-e/trunk/espresso@12428 c92efa57-630b-4861-b058-cf58834340f0
All of its subroutines and module have been transferred into LR_Modules/dynmat_sub.f90.
This will allow the subroutines to be re-used by other programs.
Note: I had to rename the subroutine "readmat" into "readmat2" because of
another readmat subroutine in PHonon/PH/elphon.f90.
git-svn-id: http://qeforge.qe-forge.org/svn/q-e/trunk/espresso@12386 c92efa57-630b-4861-b058-cf58834340f0
prevent trouble with OS-X. May or may not work (it won't unless configure
is updated: please somebody with v.2.63 of autoconf do it), may turn out to
be obsolete anyway.
git-svn-id: http://qeforge.qe-forge.org/svn/q-e/trunk/espresso@12364 c92efa57-630b-4861-b058-cf58834340f0
1) Added a flag dpsi_computed in the subroutine orthogonalize.f90 in LR_Modules. The meaning of this flag is described in the header of that routine. It was needed to generalize this subroutine. In short, this flag controls whether S*evq was already computed before or it must be computed inside of orthogonalize. If this idea does not look good for someone, then a better strategy should be found. I decided to introduce this new flag in order to make minimal changes in the PHonon code.
2) TDDFPT: The use of lr_ortho (which is a modified version of orthogonalize.f90) was replaced in several places by orthogonalize. Still in several places of the turboDavidson code there are calls to lr_ortho, which should be replaced by orthogonalize if possible. In the future lr_ortho should be removed and orthogonalize should be used instead. This is done for the sake of unification of the linear response codes in QE.
3) TDDFPT: The orthogonalization to the valence states manifold (orthogonalize) has been moved from lr_lanczos to lr_apply_liouvillian, which is now in correspondence with the definition of the Liouvillian superoperator implemented in lr_apply_liouvillian. USPP is a special case, and hence the property S^-1 P_c^+ = P_c S^-1 has been used to make such a move of the call to orthogonalize (old call to lr_ortho).
git-svn-id: http://qeforge.qe-forge.org/svn/q-e/trunk/espresso@12225 c92efa57-630b-4861-b058-cf58834340f0
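The property S^-1 P_c^+ = P_c S^-1 quoted in item 3 can be checked in a few lines. The sketch below assumes the usual USPP form of the conduction-manifold projector, P_c = 1 - \sum_v |\psi_v><\psi_v| S, with S Hermitian and invertible; this definition is an assumption and is not stated in the commit message.

```latex
\begin{align}
P_c &= 1 - \sum_v |\psi_v\rangle\langle\psi_v|\, S ,
&
P_c^{\dagger} &= 1 - S \sum_v |\psi_v\rangle\langle\psi_v| , \\
S^{-1} P_c^{\dagger}
  &= S^{-1} - S^{-1} S \sum_v |\psi_v\rangle\langle\psi_v|
   = S^{-1} - \sum_v |\psi_v\rangle\langle\psi_v| , \\
P_c\, S^{-1}
  &= S^{-1} - \sum_v |\psi_v\rangle\langle\psi_v|\, S\, S^{-1}
   = S^{-1} - \sum_v |\psi_v\rangle\langle\psi_v| .
\end{align}
```

Both products reduce to the same operator, so under these assumptions the projection can be applied on either side of S^-1.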