projectors in USPPs. Hopefully this will allow one to study larger systems.
The modifications were made primarily with the TDDFPT code in mind
(a branch of QE that I am trying to keep tightly integrated; a detailed
explanation is available on qe-forge). Please do not modify, beautify, or make
more elegant the corresponding subroutines without prior notice, because other
code depends on them.
I have checked with a number of small tests on HG1 that the current
modifications do not alter the behaviour of pw.x other than as designed.
Some Pointers:
-All the new subroutines reside in PW/realus.f90
-A new flag real_space in &electrons controls the implementation
-tqr flag is treated separately.
-The implementation works only for (serial) gamma point single point calculations.
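A minimal sketch of how the new flag might be switched on from a script that generates a pw.x input fragment. Only the flag name real_space and its place in &electrons come from the notes above; the helper names fortran_value and namelist are hypothetical, and treating real_space as a logical is an assumption.

```python
def fortran_value(value):
    """Render a Python value in Fortran namelist syntax."""
    # bool must be tested first, because bool is a subclass of int in Python
    if isinstance(value, bool):
        return ".true." if value else ".false."
    if isinstance(value, str):
        return "'" + value + "'"
    return str(value)


def namelist(name, settings):
    """Build the text of one namelist block from a dict of settings."""
    lines = ["&" + name]
    for key, value in settings.items():
        lines.append("  {} = {}".format(key, fortran_value(value)))
    lines.append("/")
    return "\n".join(lines)


# Assumption: real_space is a logical switch inside &electrons.
electrons = namelist("electrons", {"real_space": True})
print(electrons)
```

The printed block would still have to be merged with the rest of a Gamma-point input by hand.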
ToDo:
-I have written k-point and task-group implementations of most of the corresponding routines, but did not have time to integrate them.
-Parallelism issues are still to be checked.
-The discrepancy in total energy is <0.002 eV for cutoffs of 55Ry/550Ry; however,
there are some strange force components. I do not know how this will affect
a possible optimization scheme.
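A rough sketch of how the energy and force comparison mentioned above could be scripted. The 0.002 eV threshold is the one quoted in the note; the function names, the sample energies, and the sample forces are made up for illustration and are not results from any run.

```python
RY_TO_EV = 13.605693  # 1 Ry expressed in eV, rounded


def energies_agree(e_ref_ry, e_new_ry, tol_ev=0.002):
    """True if two total energies given in Ry differ by less than tol_ev (in eV)."""
    return abs(e_new_ry - e_ref_ry) * RY_TO_EV < tol_ev


def max_force_deviation(f_ref, f_new):
    """Largest absolute difference between matching force components."""
    return max(
        abs(a - b)
        for atom_ref, atom_new in zip(f_ref, f_new)
        for a, b in zip(atom_ref, atom_new)
    )


# Hypothetical numbers, chosen only to exercise the two helpers.
print(energies_agree(-93.45000, -93.45010))
print(max_force_deviation([(0.0, 0.0, 0.01)], [(0.0, 0.002, 0.01)]))
```

Looking at the largest single component rather than an average is a deliberate choice here, since a few stray components are exactly what the note worries about.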
Other:
Trying to compile the CVS version on HG1 at SISSA, using the "default" compiler
sets, I encountered a very strange compiler bug. Please have a look at
Modules/read_cards.f90 for details. Remove the stupid workaround to your liking.
git-svn-id: http://qeforge.qe-forge.org/svn/q-e/trunk/espresso@5493 c92efa57-630b-4861-b058-cf58834340f0
to prevent numerical trouble. Changed to set the output h*psi(G=0) to be real:
it should be equivalent and it is much more logical. Just in case, and in order
to have the same behavior, Im [psi(G=0)] is set to 0 before calls to h_psi
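Why the G=0 component can be forced to be real can be illustrated with a small sketch, assuming the Gamma-only setting in which the wavefunction is real in real space, so that psi(-G) = conj(psi(G)) and psi(G=0) is real. This is plain Python, not the Fortran in h_psi; the helper names and the convention that index 0 holds G=0 are assumptions.

```python
import cmath


def dft(samples):
    """Plain discrete Fourier transform of a list of samples."""
    n = len(samples)
    return [
        sum(x * cmath.exp(-2j * cmath.pi * k * j / n) for j, x in enumerate(samples))
        for k in range(n)
    ]


def zero_imag_at_g0(coeffs):
    """Return a copy of coeffs with the imaginary part of the G=0 term removed."""
    cleaned = list(coeffs)
    cleaned[0] = complex(cleaned[0].real, 0.0)
    return cleaned


# A real-valued "wavefunction" on four grid points (made-up numbers).
coeffs = dft([1.0, 2.0, 0.5, -1.0])

# Simulate numerical noise creeping into Im[psi(G=0)], then remove it.
coeffs[0] += 1e-12j
cleaned = zero_imag_at_g0(coeffs)
print(cleaned[0])
```

For real input the remaining coefficients come in conjugate pairs, which is why only the G=0 term needs this special treatment.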
git-svn-id: http://qeforge.qe-forge.org/svn/q-e/trunk/espresso@5427 c92efa57-630b-4861-b058-cf58834340f0
Note that both the electric enthalpy term and the noncollinear routines
are called by h_psi and s_psi. Changes should be harmless.
git-svn-id: http://qeforge.qe-forge.org/svn/q-e/trunk/espresso@4470 c92efa57-630b-4861-b058-cf58834340f0
data. Matrices are distributed across processors.
- to turn on the new algorithm, a new value for the
input parameter "diagonalization" has been introduced:
diagonalization = 'david+distpara'
works like david but uses a fully distributed-memory
iteration loop. Allocated memory scales down with the
number of procs. Procs involved in diagonalization can
be changed with the input parameter "ortho_para".
On multi-core CPUs it is often convenient to let only
one core per CPU work on linear algebra.
The user can tune the number of cores involved in diag. with
the keyword in the electrons namelist:
ortho_para = NN
then the code will use the largest square not larger than NN,
since matrices are always distributed on a square grid of procs.
Note that if NN < 2*nproc, then one proc every two is taken
for parallel diag. The same for NN < 4*nproc, one every four
is taken. This is to minimize memory contention on multi-core procs.
For example, if you run with 64 procs on a 4-core CPU cluster,
it may be convenient to specify:
ortho_para = 16
So that only one core per CPU is involved in diagonalization.
Further performance enhancements will be possible using
OpenMP BLAS inside regter/cegter/rdiaghg/cdiaghg (to be implemented).
For the time being, all this is only for gamma_only calculations;
cegter will follow...
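The choice of a square processor grid from ortho_para can be sketched as follows. This is an illustration, not the code in the diagonalization routines: the function name diag_grid is hypothetical, the grid size is read as the largest perfect square that fits within NN (which matches the ortho_para = 16 example), and the uniform stride nproc // n_used is a simplifying assumption that reproduces "one core per CPU" for 64 procs on 4-core CPUs.

```python
import math


def diag_grid(ortho_para, nproc):
    """Return (side, ranks): side of the square grid and the ranks used on it."""
    # Never ask for more procs than are available, and use at least one.
    limit = max(1, min(ortho_para, nproc))
    side = math.isqrt(limit)      # largest integer with side*side <= limit
    n_used = side * side
    # Assumption: spread the chosen ranks evenly over all procs.
    stride = nproc // n_used
    ranks = [i * stride for i in range(n_used)]
    return side, ranks


side, ranks = diag_grid(16, 64)
print(side, ranks)
```

With these assumptions, 64 procs and ortho_para = 16 give a 4x4 grid using every fourth rank, so consecutive ranks sharing a CPU are not all doing linear algebra at once.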
git-svn-id: http://qeforge.qe-forge.org/svn/q-e/trunk/espresso@4154 c92efa57-630b-4861-b058-cf58834340f0
for Cholesky decomposition. These routines are used to parallelize the solvers
for the generalized eigenvalue problems, namely rdiaghg and cdiaghg (notice that
the inversion of the lower triangular matrix L is still done using a serial
lapack routine). These two routines are now used for Davidson parallelization.
The old algorithm based on the orthogonalization of the correction vectors has been
removed (it was awfully slow). The performance of the new algorithm should be
decent. Beware unexpected side effects. C.S.
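How a Cholesky decomposition turns the generalized problem H x = e S x into a standard one can be shown with a small serial sketch: with S = L L^T, the matrix A = L^-1 H L^-T has the same eigenvalues e. This is plain Python on a made-up 2x2 real symmetric example, not the parallel routines or rdiaghg/cdiaghg; all function names and the matrices are illustrative.

```python
import math


def cholesky(s):
    """Lower-triangular L with L * L^T = s (s symmetric positive definite)."""
    n = len(s)
    l = [[0.0] * n for _ in range(n)]
    for i in range(n):
        for j in range(i + 1):
            acc = s[i][j] - sum(l[i][k] * l[j][k] for k in range(j))
            l[i][j] = math.sqrt(acc) if i == j else acc / l[j][j]
    return l


def invert_lower(l):
    """Inverse of a lower-triangular matrix by forward substitution."""
    n = len(l)
    inv = [[0.0] * n for _ in range(n)]
    for i in range(n):
        inv[i][i] = 1.0 / l[i][i]
        for j in range(i):
            acc = sum(l[i][k] * inv[k][j] for k in range(j, i))
            inv[i][j] = -acc / l[i][i]
    return inv


def matmul(a, b):
    """Dense matrix product of two lists of rows."""
    return [[sum(x * y for x, y in zip(row, col)) for col in zip(*b)] for row in a]


def transpose(a):
    return [list(col) for col in zip(*a)]


def eig_sym_2x2(a):
    """Eigenvalues of a 2x2 symmetric matrix, in ascending order."""
    mean = 0.5 * (a[0][0] + a[1][1])
    half = 0.5 * (a[0][0] - a[1][1])
    radius = math.sqrt(half * half + a[0][1] * a[0][1])
    return [mean - radius, mean + radius]


# Made-up example matrices: H symmetric, S symmetric positive definite.
h = [[2.0, 1.0], [1.0, 3.0]]
s = [[4.0, 2.0], [2.0, 2.0]]

l_inv = invert_lower(cholesky(s))
a = matmul(matmul(l_inv, h), transpose(l_inv))  # A = L^-1 H L^-T
eigenvalues = eig_sym_2x2(a)
print(eigenvalues)
```

The inversion of L is kept as a separate serial step here on purpose, mirroring the remark that this part is still done serially.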
git-svn-id: http://qeforge.qe-forge.org/svn/q-e/trunk/espresso@3310 c92efa57-630b-4861-b058-cf58834340f0
checks whether or not it is convenient to use the parallel diagonalizer and reports the result.
The output is still verbose to facilitate the identification of bugs.
C.S.
git-svn-id: http://qeforge.qe-forge.org/svn/q-e/trunk/espresso@2767 c92efa57-630b-4861-b058-cf58834340f0
1) f_defs.h for definitions to be included in FORTRAN files ONLY
2) c_defs.h for definitions to be included in C files ONLY
C.S.
git-svn-id: http://qeforge.qe-forge.org/svn/q-e/trunk/espresso@1012 c92efa57-630b-4861-b058-cf58834340f0
correct place (kinds); module "varie" replaced by "control_flags" (not
yet in pwcom, though) - many many files changed.
64-bit CPUs (Opteron, maybe Itanium) should now work if __LINUX64 is defined
git-svn-id: http://qeforge.qe-forge.org/svn/q-e/trunk/espresso@513 c92efa57-630b-4861-b058-cf58834340f0