band parallelization inside h_psi and s_psi. Variable "use_bgrp_in_hpsi"
should actually be moved to a more appropriate module, since it relates to
how h_psi behaves, not strictly to the parallelization over bands.
git-svn-id: http://qeforge.qe-forge.org/svn/q-e/trunk/espresso@12052 c92efa57-630b-4861-b058-cf58834340f0
with PW. There are several other variables that can be merged now, but these
three were easier and required changes to a very small number of routines.
git-svn-id: http://qeforge.qe-forge.org/svn/q-e/trunk/espresso@11983 c92efa57-630b-4861-b058-cf58834340f0
for a serial executable, (a > b) = .F. for another serial executable
on the same machine and compiler, differing only in the FFT used ...
git-svn-id: http://qeforge.qe-forge.org/svn/q-e/trunk/espresso@11939 c92efa57-630b-4861-b058-cf58834340f0
one for B3LYP. This caused discrepancies of up to a few tenths of an eV
in Kohn-Sham energies with respect to the "true" B3LYP. VWN is used
to define the LDA correlation. B3LYP-V1R (B3LYP using VWN_1_RPA instead)
has also been added.
git-svn-id: http://qeforge.qe-forge.org/svn/q-e/trunk/espresso@11913 c92efa57-630b-4861-b058-cf58834340f0
in the current run, not in all runs if restarting from an incomplete run.
Fixed for bfgs, plus some comments added, minor simplification of the logic.
git-svn-id: http://qeforge.qe-forge.org/svn/q-e/trunk/espresso@11866 c92efa57-630b-4861-b058-cf58834340f0
introducing ortho_parent_comm to be used when addressing the whole group.
linear algebra is now distributed (in PW) inside the pool group (in CPV it is left unchanged... are there pools in CPV?).
mp_global sets ortho_comm as a sub-communicator of intra_pool_comm (it used to be intra_bgrp_comm). This can be reverted
to the previous choice by commenting/uncommenting one line.
tested on PW/example02 co.rx.in case (both Gamma and K=(000)) with
-np 8 -nd 4 -nb 2
that is using 2 bgrp (procs 0123 and 4567) and diagonalizing on 4 procs (0246).
tested also on
-np 4 -nd 4 -nb 2
that is using 2 bgrp (procs 01 and 23) and diagonalizing on 4 procs (0123).
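The rank layouts quoted for the two test runs can be reproduced with a small sketch. It assumes a single pool, contiguous band groups, and an evenly strided choice of diagonalization ranks; the helper names band_groups and ortho_ranks are hypothetical and are not taken from mp_global.

```python
def band_groups(nproc, nbgrp):
    """Split ranks 0..nproc-1 into nbgrp contiguous band groups (assumed layout)."""
    size = nproc // nbgrp
    return [list(range(g * size, (g + 1) * size)) for g in range(nbgrp)]


def ortho_ranks(nproc, ndiag):
    """Pick ndiag evenly strided ranks of the pool for the linear algebra (assumed layout)."""
    stride = nproc // ndiag
    return [i * stride for i in range(ndiag)]


# -np 8 -nd 4 -nb 2
print(band_groups(8, 2))   # [[0, 1, 2, 3], [4, 5, 6, 7]]
print(ortho_ranks(8, 4))   # [0, 2, 4, 6]

# -np 4 -nd 4 -nb 2
print(band_groups(4, 2))   # [[0, 1], [2, 3]]
print(ortho_ranks(4, 4))   # [0, 1, 2, 3]
```

In both runs the diagonalization ranks span more than one band group, which is only possible if ortho_comm is carved out of the pool communicator rather than out of a single band group.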
some bgrp parallelization added to a few routines. global variables (evc,..) are NOT distributed, but some local ones
are, and more could be done.
git-svn-id: http://qeforge.qe-forge.org/svn/q-e/trunk/espresso@11863 c92efa57-630b-4861-b058-cf58834340f0
more efficient.
subroutine init_index_over_band ( comm, nbnd ), which set the ibnd_start and ibnd_end
variables and required comm=inter_bgrp_comm, is removed and replaced by
subroutine set_bgrp_indices ( nbnd, ibnd_start, ibnd_end ), which implements the same
relationships between its arguments but:
- forces the use of inter_bgrp_comm from the same mp_bands module,
- returns ibnd_start and ibnd_end as explicit outputs that are no longer kept
in the module. In this way other quantities can be distributed if needed in any
given routine without too many non-local effects.
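As an illustration of returning the index range as explicit outputs, here is a minimal sketch of a block distribution of nbnd items over the band groups. The exact formula used by set_bgrp_indices is not given in this message, so the splitting rule below (remainder spread over the first groups) is an assumption, and nbgrp and my_bgrp_id are passed explicitly here instead of being taken from the mp_bands module. The name bgrp_indices_sketch is hypothetical.

```python
def bgrp_indices_sketch(nbnd, nbgrp, my_bgrp_id):
    """Return 1-based inclusive (ibnd_start, ibnd_end) for one band group.

    Assumed rule: every group gets nbnd // nbgrp items and the first
    nbnd % nbgrp groups get one extra item.
    """
    base, rest = divmod(nbnd, nbgrp)
    ibnd_start = my_bgrp_id * base + min(my_bgrp_id, rest) + 1
    ibnd_end = ibnd_start + base - 1 + (1 if my_bgrp_id < rest else 0)
    return ibnd_start, ibnd_end


# 10 bands over 4 band groups
for gid in range(4):
    print(gid, bgrp_indices_sketch(10, 4, gid))
# 0 (1, 3)
# 1 (4, 6)
# 2 (7, 8)
# 3 (9, 10)
```

Because the range is returned rather than stored in a module, the same helper can be called with a different count (not only nbnd) to distribute any other quantity inside a given routine.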
For compatibility with TDDFPT, which uses the bgrp parallelization and loads
ibnd_start/ibnd_end through the mp_global module, these two variables are moved to
a dedicated module mp_bands_TDDFPT included in Module/mp_bands.f90. This is done
to avoid overly invasive changes in a code I don't know well. In this way the
needed changes are very localized and transparent, and the code compiles correctly,
so I think it should work exactly as before.
In my opinion the two variables should be moved somewhere inside TDDFPT.
Band parallelization is extended to the h_psi(lda,n,m,psi,hpsi) and s_psi routines
(only when .not.exx_is_active, because otherwise it is already used inside vexx)
for generic values of m (of course it gives a speedup only when m is not too small
compared to nbgrp, but it also works if m < nbgrp).
Compatibility with task groups has not been explored but should not be conceptually
different from how it works in the exx case.
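The m < nbgrp case can be illustrated with a toy emulation in which each band group fills only its own vectors of hpsi and the partial results are then summed. This is a sketch under assumptions: the zero-fill-then-sum scheme, the diagonal toy operator, and the names split_range, apply_h and h_psi_banded are all hypothetical and stand in for whatever h_psi actually does.

```python
def split_range(m, nbgrp, gid):
    """0-based half-open slice of the m vectors owned by band group gid."""
    base, rest = divmod(m, nbgrp)
    start = gid * base + min(gid, rest)
    return start, start + base + (1 if gid < rest else 0)


def apply_h(vec):
    """Toy stand-in for applying H to one vector: a diagonal operator."""
    return [(i + 1) * x for i, x in enumerate(vec)]


def h_psi_banded(psi, nbgrp):
    """Emulate nbgrp band groups, each computing only its own vectors."""
    m = len(psi)
    partial = []
    for gid in range(nbgrp):
        hpsi = [[0.0] * len(v) for v in psi]   # vectors of other groups stay zero
        lo, hi = split_range(m, nbgrp, gid)    # empty slice when the group owns nothing
        for k in range(lo, hi):
            hpsi[k] = apply_h(psi[k])
        partial.append(hpsi)
    # Element-wise sum over groups, standing in for a reduction across band groups
    return [[sum(p[k][i] for p in partial) for i in range(len(psi[k]))]
            for k in range(m)]


psi = [[1.0, 2.0, 3.0], [0.5, 0.0, -1.0]]
serial = [apply_h(v) for v in psi]
assert h_psi_banded(psi, 4) == serial   # m = 2 < nbgrp = 4 still gives the serial result
```

With m = 2 and nbgrp = 4, two of the groups own an empty slice and contribute only zeros, so the result is correct but nothing is gained from the extra groups.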
git-svn-id: http://qeforge.qe-forge.org/svn/q-e/trunk/espresso@11835 c92efa57-630b-4861-b058-cf58834340f0
"erroneous arithmetic operation". Added a check preventing the calculation
of exp(-x) for x>40, since 1+exp(-x) is not distinguishable from 1 anyway
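The reasoning behind the cutoff can be checked numerically. The sketch below assumes double precision and a hypothetical helper name, one_plus_exp_neg; it only illustrates the guard and is not the routine that was changed.

```python
import math

CUTOFF = 40.0  # threshold quoted in the message


def one_plus_exp_neg(x):
    """Return 1 + exp(-x), skipping the exponential when it cannot change the result."""
    if x > CUTOFF:
        return 1.0
    return 1.0 + math.exp(-x)


# Beyond the cutoff, exp(-x) is far below double-precision epsilon (about 2.2e-16),
# so adding it to 1 leaves the result unchanged.
assert math.exp(-CUTOFF) < 2.2e-16
assert 1.0 + math.exp(-CUTOFF) == 1.0
assert one_plus_exp_neg(50.0) == 1.0
assert one_plus_exp_neg(0.0) == 2.0
```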
git-svn-id: http://qeforge.qe-forge.org/svn/q-e/trunk/espresso@11832 c92efa57-630b-4861-b058-cf58834340f0
option MPI_THREAD_MULTIPLE, while MPI_THREAD_FUNNELED seems to work ok.
Since the latter case is all we need, I think it is safer to stick to it
git-svn-id: http://qeforge.qe-forge.org/svn/q-e/trunk/espresso@11797 c92efa57-630b-4861-b058-cf58834340f0
as well): variable "input_xml_schema_file" is by default set to an empty
string; if not read in input by read_namelist, it is read from environment
variable QEXML. Far from perfect, maybe "input_xml_schema_file" should be
written to and read from xml data file?
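The fallback order described here (value read from input first, then the QEXML environment variable, else empty) can be sketched as follows. The function name resolve_schema_file and its arguments are hypothetical; only the variable name QEXML and the empty-string default come from this message.

```python
import os


def resolve_schema_file(namelist_value="", environ=None):
    """Pick the schema file: the input value if set, else $QEXML, else an empty string."""
    environ = os.environ if environ is None else environ
    if namelist_value.strip():
        return namelist_value          # explicitly given in input
    return environ.get("QEXML", "")    # fall back to the environment


print(resolve_schema_file("a.xsd", {"QEXML": "b.xsd"}))  # a.xsd
print(resolve_schema_file("", {"QEXML": "b.xsd"}))       # b.xsd
```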
git-svn-id: http://qeforge.qe-forge.org/svn/q-e/trunk/espresso@11776 c92efa57-630b-4861-b058-cf58834340f0
command_argument_count, flush, are used everywhere instead of wrappers.
Some old versions of compilers may no longer work.
git-svn-id: http://qeforge.qe-forge.org/svn/q-e/trunk/espresso@11759 c92efa57-630b-4861-b058-cf58834340f0
but I suspect that the problem arises only for points of the FFT array beyond
the physical range, where the gradient of rho is exactly zero
git-svn-id: http://qeforge.qe-forge.org/svn/q-e/trunk/espresso@11721 c92efa57-630b-4861-b058-cf58834340f0