array arguments of the mother sub. with explicit dimensions)
- task groups implemented for k points too.
- task groups implemented also in the loop over bands contained in sum_band
- task groups NOT YET implemented for non collinear spin and meta dft
git-svn-id: http://qeforge.qe-forge.org/svn/q-e/trunk/espresso@4587 c92efa57-630b-4861-b058-cf58834340f0
no global replicated matrices are now allocated inside cdiaghg
- real routine will follow soon
- note that the number of processors involved in the diagonalization
is the largest square smaller than or equal to nproc_pool
- a different number of processors can be suggested in
the input with the parameter ortho_para (as for cp)
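
The processor-count rule above can be sketched as follows. This is only an illustration, not the code used in cdiaghg: the names ortho_grid_sketch and grid_side are hypothetical, and the integer loop is just one possible way to find the largest square not exceeding nproc_pool.

```fortran
module ortho_grid_sketch
  implicit none
contains
  ! Side of the largest square grid such that side*side <= nproc.
  ! Integer arithmetic only, to avoid rounding issues with SQRT.
  pure function grid_side(nproc) result(side)
    integer, intent(in) :: nproc
    integer :: side
    side = 0
    do while ( (side + 1)*(side + 1) <= nproc )
      side = side + 1
    end do
  end function grid_side
end module ortho_grid_sketch

program ortho_grid_demo
  use ortho_grid_sketch, only : grid_side
  implicit none
  integer :: nproc_pool, side
  ! Print the number of processors that would take part in the
  ! diagonalization for a few pool sizes.
  do nproc_pool = 1, 17
    side = grid_side(nproc_pool)
    print '(a,i3,a,i3)', ' nproc_pool =', nproc_pool, &
                         '   processors in diag. =', side*side
  end do
end program ortho_grid_demo
```

With nproc_pool = 10, for instance, this gives a 3x3 grid, so 9 processors would take part in the diagonalization.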
git-svn-id: http://qeforge.qe-forge.org/svn/q-e/trunk/espresso@4104 c92efa57-630b-4861-b058-cf58834340f0
for Cholesky decomposition. These routines are used to parallelize the solvers
for the generalized eigenvalue problems, namely rdiaghg and cdiaghg (notice that
the inversion of the lower triangular matrix L is still done using a serial
lapack routine). These two routines are now used for Davidson parallelization.
The old algorithm, based on the orthogonalization of the correction vectors, has
been removed (it was awfully slow). The performance of the new algorithm should be
decent. Beware of unexpected side effects. C.S.
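
For reference, the way a Cholesky decomposition turns the generalized problem H x = e S x into a standard one can be sketched as below. This is a small serial illustration with hand-written loops, not the parallel routines of this commit and not the LAPACK calls used in rdiaghg/cdiaghg; the module, the function names and the 2x2 matrices are all made up for the example.

```fortran
module cholesky_sketch
  implicit none
  integer, parameter :: dp = selected_real_kind(14,200)
contains
  ! Lower triangular L with S = L L^T (S symmetric positive definite)
  function cholesky_lower(s) result(l)
    real(dp), intent(in) :: s(:,:)
    real(dp) :: l(size(s,1),size(s,1))
    integer :: i, j, n
    n = size(s,1)
    l = 0.0_dp
    do j = 1, n
      l(j,j) = sqrt( s(j,j) - sum( l(j,1:j-1)**2 ) )
      do i = j + 1, n
        l(i,j) = ( s(i,j) - sum( l(i,1:j-1)*l(j,1:j-1) ) ) / l(j,j)
      end do
    end do
  end function cholesky_lower

  ! Inverse of a lower triangular matrix by forward substitution
  function lower_inverse(l) result(linv)
    real(dp), intent(in) :: l(:,:)
    real(dp) :: linv(size(l,1),size(l,1))
    integer :: i, j, n
    n = size(l,1)
    linv = 0.0_dp
    do j = 1, n
      linv(j,j) = 1.0_dp / l(j,j)
      do i = j + 1, n
        linv(i,j) = -sum( l(i,j:i-1)*linv(j:i-1,j) ) / l(i,i)
      end do
    end do
  end function lower_inverse

  ! A = L^{-1} H L^{-T}: H x = e S x becomes A y = e y, with y = L^T x
  function reduce_to_standard(h, s) result(a)
    real(dp), intent(in) :: h(:,:), s(:,:)
    real(dp) :: a(size(h,1),size(h,1))
    real(dp) :: linv(size(h,1),size(h,1))
    linv = lower_inverse( cholesky_lower(s) )
    a = matmul( linv, matmul( h, transpose(linv) ) )
  end function reduce_to_standard
end module cholesky_sketch

program cholesky_demo
  use cholesky_sketch
  implicit none
  real(dp) :: h(2,2), s(2,2), a(2,2), e(2), tr, det, disc
  integer :: k
  ! Made-up symmetric H and symmetric positive definite S
  h = reshape( (/ 2.0_dp, 1.0_dp, 1.0_dp, 3.0_dp /), (/ 2, 2 /) )
  s = reshape( (/ 4.0_dp, 1.0_dp, 1.0_dp, 2.0_dp /), (/ 2, 2 /) )
  a = reduce_to_standard(h, s)
  ! Eigenvalues of the symmetric 2x2 matrix A in closed form
  tr   = a(1,1) + a(2,2)
  det  = a(1,1)*a(2,2) - a(1,2)*a(2,1)
  disc = sqrt( max( tr*tr - 4.0_dp*det, 0.0_dp ) )
  e = (/ 0.5_dp*(tr - disc), 0.5_dp*(tr + disc) /)
  ! Each e should also make det(H - e S) vanish (up to rounding)
  do k = 1, 2
    print '(a,f12.8,a,es10.2)', ' e =', e(k), '   det(H - e S) =', &
      (h(1,1)-e(k)*s(1,1))*(h(2,2)-e(k)*s(2,2)) - (h(1,2)-e(k)*s(1,2))*(h(2,1)-e(k)*s(2,1))
  end do
end program cholesky_demo
```

For these made-up matrices det(H - e S) = 7 e^2 - 14 e + 5, so the two eigenvalues should come out as 1 - sqrt(14)/7 and 1 + sqrt(14)/7. The explicit inverse of L is kept here only because the commit message mentions it; a production code would more likely use triangular solves.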
git-svn-id: http://qeforge.qe-forge.org/svn/q-e/trunk/espresso@3310 c92efa57-630b-4861-b058-cf58834340f0
checks whether or not it is convenient to use the parallel diagonalizer and reports the result.
The output is still verbose to facilitate the identification of bugs.
C.S.
git-svn-id: http://qeforge.qe-forge.org/svn/q-e/trunk/espresso@2767 c92efa57-630b-4861-b058-cf58834340f0
- Changed format when writing a copy of the UPF file: for some reason
the free format adds an initial blank character, causing the program
to fail when reading (in fixed format) additional info for spin-orbit.
Format A is now used when writing, free format when reading. (AdC)
- More preprocessing cleanup and documentation: anybody having access
to weird machines is kindly requested to verify if things work
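
The free-format issue described in the first item can be reproduced with internal files, as in the sketch below. This is not the UPF writing code: the module, the function names and the string 'rel_data' are hypothetical and serve only to show the extra leading blank and its effect on a fixed-format read.

```fortran
module upf_format_sketch
  implicit none
contains
  ! Record produced by a list-directed ("free format") write
  function written_free(text) result(line)
    character(len=*), intent(in) :: text
    character(len=len(text)+4) :: line
    write( line, * ) text
  end function written_free

  ! Record produced by a write with format A
  function written_a(text) result(line)
    character(len=*), intent(in) :: text
    character(len=len(text)+4) :: line
    write( line, '(A)' ) text
  end function written_a
end module upf_format_sketch

program upf_format_demo
  use upf_format_sketch
  implicit none
  ! Hypothetical field, not an actual UPF keyword
  character(len=*), parameter :: tag = 'rel_data'
  character(len=len(tag)+4) :: rec_free, rec_a
  character(len=len(tag))   :: field
  rec_free = written_free(tag)
  rec_a    = written_a(tag)
  print '(3a)', ' written with *  : [', rec_free, ']'
  print '(3a)', ' written with A  : [', rec_a, ']'
  ! A fixed-format read keeps the leading blank of the free-format record
  read( rec_free, '(A)' ) field
  print '(3a)', ' * then read A   : [', field, ']'
  ! A free-format read skips leading blanks, so it works on both records
  read( rec_free, * ) field
  print '(3a)', ' * then read *   : [', field, ']'
  read( rec_a, * ) field
  print '(3a)', ' A then read *   : [', field, ']'
end program upf_format_demo
```

The combination chosen in the commit (format A when writing, free format when reading) corresponds to the last case.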
git-svn-id: http://qeforge.qe-forge.org/svn/q-e/trunk/espresso@2732 c92efa57-630b-4861-b058-cf58834340f0
conversion to real    => DBLE (including real part of a complex number)
conversion to complex => CMPLX
complex conjugate     => CONJG
imaginary part        => AIMAG
All functions are uppercase.
CMPLX is preprocessed by f_defs.h and performs an explicit cast:
#define CMPLX(a,b) cmplx(a,b,kind=DP)
This implies that 1) f_defs.h must be included whenever a CMPLX is present,
2) CMPLX should stay in a single line, 3) DP must be defined.
All occurrences of real, float, dreal, dfloat, dconjg, dimag, dcmplx
removed - please do not reintroduce any of them.
Tested only with ifc7 and g95 - beware unintended side effects
Maybe not the best solution (explicit casts everywhere would be better)
but it can be easily changed with a script if the need arises.
The following code might be used to test for possible trouble:
program test_intrinsic
  implicit none
  integer, parameter :: dp = selected_real_kind(14,200)
  real (kind=dp) :: a = 0.123456789012345_dp
  real (kind=dp) :: b = 0.987654321098765_dp
  complex (kind=dp) :: c = ( 0.123456789012345_dp, 0.987654321098765_dp)
  print *, ' A = ', a
  print *, ' DBLE(A)= ', DBLE(a)
  print *, ' C = ', c
  print *, 'CONJG(C)= ', CONJG(c)
  print *, 'DBLE(c),AIMAG(C) = ', DBLE(c), AIMAG(c)
  print *, 'CMPLX(A,B,kind=dp)= ', CMPLX( a, b, kind=dp)
end program test_intrinsic
Note that CMPLX and REAL without a cast yield single precision numbers on
ifc7 and g95 !!!
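
The loss of precision mentioned in the note can be checked directly with a sketch like the one below. It assumes that the default real kind of the compiler is single precision (no promotion flags) and that the file is not preprocessed with f_defs.h, so that the lowercase cmplx here is the plain intrinsic; the module and function names are hypothetical.

```fortran
module kind_sketch
  implicit none
  integer, parameter :: dp = selected_real_kind(14,200)
contains
  ! True if CMPLX without KIND= returns a complex number of kind dp
  logical function cmplx_keeps_dp(a, b)
    real(dp), intent(in) :: a, b
    cmplx_keeps_dp = ( kind( cmplx(a, b) ) == dp )
  end function cmplx_keeps_dp

  ! Modulus of the difference between CMPLX with and without KIND=
  real(dp) function cmplx_error(a, b)
    real(dp), intent(in) :: a, b
    cmplx_error = abs( cmplx(a, b, kind=dp) - cmplx(a, b) )
  end function cmplx_error
end module kind_sketch

program kind_demo
  use kind_sketch
  implicit none
  real(dp) :: a = 0.123456789012345_dp
  real(dp) :: b = 0.987654321098765_dp
  print *, ' dp                         = ', dp
  print *, ' kind of CMPLX(a,b)         = ', kind( cmplx(a, b) )
  print *, ' kind of CMPLX(a,b,kind=dp) = ', kind( cmplx(a, b, kind=dp) )
  print *, ' kind of REAL(a)            = ', kind( real(a) )
  print *, ' CMPLX(a,b) keeps dp ?      = ', cmplx_keeps_dp(a, b)
  print *, ' |difference|               = ', cmplx_error(a, b)
end program kind_demo
```

If the two kinds printed for CMPLX differ and the difference is not zero, an explicit kind=dp (or the CMPLX macro of f_defs.h) is needed on that compiler.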
git-svn-id: http://qeforge.qe-forge.org/svn/q-e/trunk/espresso@2133 c92efa57-630b-4861-b058-cf58834340f0
1) f_defs.h for definitions to be included in FORTRAN files ONLY
2) c_defs.h for definitions to be included in C files ONLY
C.S.
git-svn-id: http://qeforge.qe-forge.org/svn/q-e/trunk/espresso@1012 c92efa57-630b-4861-b058-cf58834340f0
This required a deep modification of the parallelism in PWscf:
there are two new communicators (intra_image_comm and inter_image_comm) and the
existing "pool" communicators (intra_pool_comm and inter_pool_comm) are now vectors
of length given by the number of parallel images. #ifdef __PARA is no longer
needed because all "parallel" variables are always initialized for a serial run
and all parallel routines are, in the case of a serial run, dummy routines.
The wrappers to MPI routines used only by PWscf are in the PW/para.f90 file.
The others (mp_***) are in Modules/mp.f90. All explicit references to mpif.h
should be replaced by a "USE parallel_include" (in a serial run parallel_include
is simply a dummy module).
2) The extrapolation of both potential and wavefunctions has been rewritten in
order to be smarter than before: on the basis of the required extrapolation
order, on the basis of the history and on the basis of which files are really
present on the disk, the algorithm chooses the extrapolation order.
All the algorithms in which ions are moved can use the extrapolation.
These are both unstable features: I need the help of everybody to test them.
C.S.
git-svn-id: http://qeforge.qe-forge.org/svn/q-e/trunk/espresso@742 c92efa57-630b-4861-b058-cf58834340f0
All kinds of ionic dynamics are done by a single cpu (see move_ions.f90).
After the ions are moved, the new positions (and other information) are
broadcast to all other cpus.
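
The "one cpu moves, then broadcast" pattern can be sketched as below. This is a serial stand-in, not the code in move_ions.f90: all names are hypothetical, the update is a trivial displacement, and the broadcast is a dummy routine that does nothing because a serial run has a single process. In a parallel build the dummy would be replaced by a wrapper around an MPI broadcast.

```fortran
module ion_bcast_sketch
  implicit none
  integer, parameter :: dp = selected_real_kind(14,200)
  integer, parameter :: root    = 0   ! rank that moves the ions (assumption)
  integer, parameter :: my_rank = 0   ! serial run: the only process is the root
contains
  ! Hypothetical stand-in for the ionic update: a plain displacement
  subroutine move_ions_sketch(tau, dtau)
    real(dp), intent(inout) :: tau(:,:)
    real(dp), intent(in)    :: dtau(:,:)
    tau = tau + dtau
  end subroutine move_ions_sketch

  ! Serial dummy broadcast: with one process there is nothing to send,
  ! so the data are left untouched.
  subroutine bcast_positions(tau, source)
    real(dp), intent(inout) :: tau(:,:)
    integer, intent(in)     :: source
  end subroutine bcast_positions
end module ion_bcast_sketch

program move_ions_demo
  use ion_bcast_sketch
  implicit none
  real(dp) :: tau(3,2), dtau(3,2)
  tau  = 0.0_dp
  dtau = 0.01_dp
  ! Only one process performs the ionic step ...
  if ( my_rank == root ) call move_ions_sketch(tau, dtau)
  ! ... then every process calls the broadcast to receive the new positions
  call bcast_positions(tau, root)
  print '(a,3f8.4)', ' atom 1:', tau(:,1)
  print '(a,3f8.4)', ' atom 2:', tau(:,2)
end program move_ions_demo
```

The calling code is the same in serial and parallel runs; only the body of the broadcast routine would change.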
C.S.
git-svn-id: http://qeforge.qe-forge.org/svn/q-e/trunk/espresso@701 c92efa57-630b-4861-b058-cf58834340f0
correct place (kinds); module "varie" replaced by "control_flags" (not
yet in pwcom, though) - many many files changed.
64-bit cpus (Opteron, maybe Itanium) should now work if __LINUX64 is defined
git-svn-id: http://qeforge.qe-forge.org/svn/q-e/trunk/espresso@513 c92efa57-630b-4861-b058-cf58834340f0
Make.rules_cpp => Rules.cpp, Make.rules_nocpp => Rules.nocpp
lapack_mkl.f added, __MKL removed
Make.{fujutsu,sxcross}, compile error in restart.f90 (Guido)
electrons, punch_band, plot_bands: use the same format
for reading and writing eigenvalues
git-svn-id: http://qeforge.qe-forge.org/svn/q-e/trunk/espresso@119 c92efa57-630b-4861-b058-cf58834340f0