2) restart in the parallel case was likely broken because unit 'iunres'
was not set to its correct value on all processors
git-svn-id: http://qeforge.qe-forge.org/svn/q-e/trunk/espresso@4192 c92efa57-630b-4861-b058-cf58834340f0
It is now possible to run the PH code for just one q point out of the
full dispersion grid (specified by nq1, nq2, nq3). This way there is
no need to run PW with 'phonon' first for a non-Gamma q point, and
perhaps there are some other positive (or negative?) effects that I
am not aware of.
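As an illustration only (start_q/last_q are the keywords documented in
later PH versions for this purpose; I am assuming the same mechanism at
this revision), a ph.x input selecting one point of a 4x4x4 grid could
look like:

    phonon dispersion, single grid point
    &inputph
       prefix  = 'pwscf'
       ldisp   = .true.
       nq1 = 4, nq2 = 4, nq3 = 4
       start_q = 2    ! compute only the second q point of the grid
       last_q  = 2
    /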
git-svn-id: http://qeforge.qe-forge.org/svn/q-e/trunk/espresso@4191 c92efa57-630b-4861-b058-cf58834340f0
each configuration. This allows one to run spin-polarized and
spin-unpolarized tests simultaneously.
frozen_core: if true, the code performs frozen-core transferability tests.
In the all-electron calculation the core wavefunctions of the first
configuration are kept fixed.
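As a sketch (the electronic configurations are made up, and lsdts is
assumed to be the per-configuration spin flag this change refers to,
as in the documented ld1.x test namelist), this could be driven by:

    &test
       nconf = 2
       frozen_core = .true.            ! core of configuration 1 kept fixed
       configts(1) = '3d10 4s2 4p0'
       lsdts(1)    = 0                 ! spin-unpolarized test
       configts(2) = '3d10 4s2 4p0'
       lsdts(2)    = 1                 ! spin-polarized test
    /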
git-svn-id: http://qeforge.qe-forge.org/svn/q-e/trunk/espresso@4190 c92efa57-630b-4861-b058-cf58834340f0
with questionable spelling
fftdrv: use the same scatter algorithm as PWscf (what_scatter=1)
the previous default, what_scatter=0, did not allow nr3x to differ from nr3
git-svn-id: http://qeforge.qe-forge.org/svn/q-e/trunk/espresso@4188 c92efa57-630b-4861-b058-cf58834340f0
The old GGA is still available by compiling with the flag __OLD_NONCOLIN_GGA.
This routine is experimental.
(Thanks to G. Sclauzero for useful discussion).
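For instance (illustrative; the other flags and the file contents depend
on your configuration), one can append the flag to the preprocessor
definitions in make.sys:

    DFLAGS = -D__LINUX -D__FFTW -D__MPI -D__PARA -D__OLD_NONCOLIN_GGA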
git-svn-id: http://qeforge.qe-forge.org/svn/q-e/trunk/espresso@4158 c92efa57-630b-4861-b058-cf58834340f0
the save attribute:
"ld1inc.f90", line 8.0: 1513-191 (S) A variable declared in the scope of a module, grid, that is of a derived type with default initialization, must have the SAVE attribute.
???
I've added the SAVE attribute; it does not hurt anyway.
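A minimal illustration of the rule the compiler enforces (the type and
component names here are invented; only the module name and 'grid' come
from the message):

    module ld1inc
      implicit none
      type grid_type
         integer :: mesh = 0   ! default initialization triggers the rule
      end type grid_type
      ! XL Fortran demands SAVE for a module variable of a derived type
      ! with default initialization; newer standards give module
      ! variables the SAVE attribute implicitly.
      type(grid_type), save :: grid
    end module ld1inc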
git-svn-id: http://qeforge.qe-forge.org/svn/q-e/trunk/espresso@4156 c92efa57-630b-4861-b058-cf58834340f0
data. Matrices are distributed across processors.
- to turn on the new algorithm, a new value for the
input parameter "diagonalization" has been introduced:
    diagonalization = 'david+distpara'
works like 'david' but uses fully distributed memory in the
iteration loop. The allocated memory scales down with the
number of procs. The procs involved in diagonalization can
be changed with the input parameter "ortho_para".
On multi-core CPUs it is often convenient to let only
one core per CPU work on linear algebra.
The user can tune the number of cores involved in diagonalization
with the following keyword in the electrons namelist:
    ortho_para = NN
The code will then use the largest square not larger than NN, since
matrices are always distributed on a square grid of procs.
Note that if NN <= nproc/2, one proc out of every two is taken for
the parallel diagonalization; likewise, if NN <= nproc/4, one out of
every four is taken. This minimizes memory contention on multi-core
processors (see the sketch just below).
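A rough sketch of this selection rule (illustrative only, with made-up
variable names; not the actual QE routine):

    ! derive a square proc grid and a stride from ortho_para (nn)
    program ortho_group_sketch
      implicit none
      integer :: nn, nproc, side, stride
      nn     = 16                         ! requested ortho_para
      nproc  = 64                         ! total number of procs
      side   = int( sqrt( real( nn ) ) )  ! largest square grid <= nn
      stride = 1
      if ( 4 * side * side <= nproc ) then
         stride = 4                       ! one proc every four
      else if ( 2 * side * side <= nproc ) then
         stride = 2                       ! one proc every two
      end if
      print '(a,i0,a,i0,a,i0)', 'grid ', side, ' x ', side, ', stride ', stride
    end program ortho_group_sketch

With nn = 16 and nproc = 64 this selects a 4 x 4 grid with stride 4,
i.e. one core per 4-core CPU, as in the example below.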
For example, if you run with 64 procs on a cluster of 4-core CPUs,
it may be convenient to specify:
    ortho_para = 16
so that only one core per CPU is involved in diagonalization.
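In input, this corresponds to an electrons namelist fragment such as
(values illustrative):

    &electrons
       diagonalization = 'david+distpara'
       ortho_para      = 16
    /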
Further performance enhancements will be possible using OpenMP BLAS
inside regterg/cegterg/rdiaghg/cdiaghg (to be implemented).
For the time being, all this is only for gamma_only calculations;
cegterg will follow...
git-svn-id: http://qeforge.qe-forge.org/svn/q-e/trunk/espresso@4154 c92efa57-630b-4861-b058-cf58834340f0
git-svn-id: http://qeforge.qe-forge.org/svn/q-e/trunk/espresso@4153 c92efa57-630b-4861-b058-cf58834340f0
git-svn-id: http://qeforge.qe-forge.org/svn/q-e/trunk/espresso@4152 c92efa57-630b-4861-b058-cf58834340f0