Samuel Ponce
df83585b3a
Change in post-processing superconductivity
...
The distance from the Fermi level was not taken into account
when writing the superconducting gaps on the Fermi surface to file.
Issue raised by Miao Gao and solved by R. Margine and S. Ponce.
2019-04-02 23:53:48 +01:00
Paolo Giannozzi
6834a502ef
[Skip-CI] Obsolete version 'svn' replaced by 'git'; various .PHONY of questionable
...
usefulness, referring to a no-longer-existing procedure devised for svn, removed
2019-03-01 17:42:56 +01:00
Pietro Delugas
3b20b0ff9f
modified xml printout for vdw element
...
* fixed reporting for xdm cases
* added Grimme-d3 case
2019-02-12 18:07:17 +00:00
Pietro
dfd95eb378
Remove or reduce MPI message splitting in CUDA Fortran implementation of UtilXlib
2019-02-11 09:49:52 +00:00
Paolo Giannozzi
5a6a4417dd
LAXlib and UtilXlib should also make better use of parallel_include
2019-02-04 10:25:07 +01:00
Pietro Bonfa
c4d0af26dd
Added optional argument to deallocate UtilXlib buffers and checks to avoid double allocations on mp_start.
2018-12-17 13:30:04 +01:00
Pietro Bonfa
1cfc3dad84
Forgot actual mp_bcast_c6d_gpu test.
2018-12-11 19:01:27 +01:00
Pietro Bonfa
fae8e4a424
Added tests for mp_bcast_c6d_gpu and mp_sum_r6d_gpu
2018-12-11 18:56:31 +01:00
Pietro Bonfa
4961c03ff2
Merged QEF/develop and added GPU subroutines for mp_bcast_c6d and mp_sum_r6d
2018-12-11 18:55:17 +01:00
Pietro Bonfa
448c98fc6d
Preprocessor directives aligned correctly. Arrays that get updated now have the inout attribute.
2018-11-12 18:38:16 +01:00
Pietro Bonfa
f6ea23b0a3
New cuda_util
2018-11-09 19:44:28 +01:00
Pietro Bonfa
5a267077ae
Avoid useless copies also in mp_allgatherv_inplace_cplx_array_gpu and mp_gatherv_inplace_cplx_array_gpu
2018-11-09 19:36:06 +01:00
Pietro Bonfa
f71cdbfcdb
A few cuf kernel based helper subroutines for the CUDA port
2018-11-08 11:29:39 +01:00
Pietro Bonfa
5724f251eb
Avoid some unnecessary data copies on collective communications among groups with one process
2018-11-07 10:52:20 +01:00
Iurii Timrov
52a67b19a5
1) Implementation of the PHonon+U code (A. Floris, S. de Gironcoli, E.K.U. Gross,
...
I. Timrov, B. Himmetoglu, N. Marzari, M. Cococcioni). The code was ported
from QE 5.0.2 to the latest version of QE, by I. Timrov with the help of
A. Floris and M. Cococcioni. Many thanks for the discussions with P. Giannozzi,
P. Delugas, A. Dal Corso, M. Calandra, L. Paulatto about various issues
during the porting. Sorry if I forgot to mention someone.
2) Some small modifications in the HP code in order to be consistent
with the porting of PHonon+U and changes in LR_Modules.
2018-10-30 16:20:32 +01:00
Pietro Bonfa
8fa8008e26
Added missing dependency
2018-10-09 11:35:23 +02:00
Pietro Bonfa
6112f56d5d
Added information regarding MPI interfaces
2018-09-28 16:01:30 +02:00
Pietro Bonfa
1f250ee0f8
Added warning about synchronization behaviour
2018-09-19 14:37:51 +02:00
Pietro Bonfa
9ca6912a7d
Added correct handling of CUDA streams synchronization. New behaviour detailed in source file and in README.md
2018-09-18 18:34:12 +02:00
Pietro Bonfa
ace7570da1
Added README file
2018-08-16 14:17:17 +02:00
Stefano de Gironcoli
86920021c8
revert thread_util to version w/o data chunking
2018-08-07 20:45:56 +02:00
Pietro Bonfa
5dd5569187
Merge branch 'develop' into mpicuda
2018-08-07 16:18:31 +02:00
Stefano de Gironcoli
ac8b63bd4c
update of previous merge PPCG
2018-08-05 08:25:56 +00:00
Daniel Pinkal
29164dbf9a
Update clocks_handler.f90
...
Fixed broken clock output for WALL times > 1h and harmonized spacing of d-h-m-s time output.
For the WALL time output in the case mhour>0, "nmin" was wrongly passed as the first variable of the output string.
Additionally, the spacing of all d-h-m-s outputs was changed so that the words "CPU" and "WALL" line up with the seconds-only output [if defined(__CLOCK_SECONDS)].
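A minimal sketch of the kind of d-h-m-s splitting this fix concerns, written in Python rather than the Fortran of clocks_handler.f90. The function names, field widths, and output strings are assumptions made for illustration only and do not reproduce the actual format statements. The point it shows is that each branch must print its own leading unit (hours first when the time exceeds one hour), which is where the original output went wrong.

```python
def split_dhms(seconds):
    """Split a non-negative duration in seconds into (days, hours, minutes, secs)."""
    nday, rem = divmod(seconds, 86400)
    nhour, rem = divmod(rem, 3600)
    nmin, nsec = divmod(rem, 60)
    return int(nday), int(nhour), int(nmin), nsec


def format_time(seconds):
    """Format a duration, choosing the layout from its largest non-zero unit.

    Hypothetical layout: the widths below are illustrative, not those of QE.
    """
    nday, nhour, nmin, nsec = split_dhms(seconds)
    if nday > 0:
        return f"{nday:3d}d{nhour:2d}h{nmin:2d}m"
    if nhour > 0:
        # Hours come first here; printing nmin in this slot was the kind of bug fixed.
        return f"{nhour:3d}h{nmin:2d}m"
    if nmin > 0:
        return f"{nmin:3d}m{nsec:5.2f}s"
    return f"{nsec:9.2f}s"
```

For example, `format_time(3725.0)` takes the hours branch, since 3725 s is 1 h 2 m 5 s.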
2018-07-24 13:49:02 +00:00
Pietro Bonfa
119fd5a924
Added unit tests for new gather functionalities
2018-07-05 17:26:42 +02:00
Pietro Bonfa
5fb0f3c45c
Added GPU version of mp_type_create_cplx_column_section and corrected interface for mp_gather
2018-07-05 15:15:35 +02:00
paoloumari
f1409c1daf
Updated using makedep.sh
2018-06-30 10:18:43 +02:00
Paolo Giannozzi
7a426ad216
Release-notes updated, obsolete script for release deleted.
...
Slightly improved printout of wall times (format depends upon wall time,
not cpu time as before: no difference except for some cases with OpenMP)
2018-06-20 18:37:23 +02:00
Stefano de Gironcoli
7b84c0114e
undo the rescaling of the cpu time per thread.
...
It can be restored by uncommenting one line in UtilXlib/clocks_handler.f90;
search for PRINT_AVG_CPU_TIME_PER_THREAD in that file.
2018-06-20 09:32:07 +02:00
Stefano de Gironcoli
8b3eb1752d
cpu time rescaling is now optional, enabled with a pre-processing option
2018-06-20 09:32:07 +02:00
Stefano De Gironcoli
87400ecfd3
cpu time per mpi task scaled down by the number of omp processes participating in it
2018-06-20 09:32:07 +02:00
Pietro Bonfa
b138880ebc
Test mp_max_rv_buffer today showed two typos in parallel_(max|min)_real_gpu
2018-06-18 11:07:49 +02:00
Pietro Bonfa
fa3653b6e3
Test was not doing what it's supposed to do.
2018-06-18 11:05:29 +02:00
Pietro Bonfa
83e08840e7
Merged changes in develop adding new utilxlib mpi interfaces. Cuda equivalent functions added. Tests for new subroutines will be added in a separate commit.
2018-06-18 11:03:25 +02:00
Ye Luo
6ac7f8c32a
Merge branch 'bugfix-ndiag' into opt-threading-all-parts
2018-06-14 19:05:31 -05:00
Paolo Giannozzi
2c4f0dbdac
More realistic memory estimate for EXX calculations.
...
Minor improvements to memory counter.
2018-06-08 08:27:01 +02:00
Ye Luo
8812c4085f
Reverted to the old algorithm in hpsi_dot_v.
2018-06-02 16:24:36 -05:00
Ye Luo
fa21b8d52a
Add functions to do threaded memcpy and memset
...
threaded_memXXX contains a parallel do region
threaded_barrier_memXXX contains do region without parallel
threaded_nowait_memXXX contains do region without parallel and a nowait at the end do
2018-06-02 12:22:42 -05:00
Ye Luo
c2b2efdaed
Use more sensible names.
2018-05-31 12:29:41 -05:00
Ye Luo
14508b0810
Optimize hpsi_dot_v
2018-05-28 19:13:39 -05:00
Ye Luo
af2fac5ef9
Replace allgather with gather.
2018-05-28 08:34:42 -05:00
Ye Luo
2c6c859896
Remove all unnecessary mem ops in cegterg.
2018-05-27 21:54:46 -05:00
Pietro Bonfa
47bf66c777
Random seed was not set when using /dev/urandom
2018-03-16 16:38:15 +01:00
Pietro Bonfa
c376f8e240
More tests for mp_bcast and mp_sum
2018-03-06 19:09:52 +01:00
Pietro Bonfa
59a0749fc8
Better fix for issue #16.
2018-03-06 18:03:46 +01:00
Pietro Bonfa
971d9de467
Real random numbers used now, paranoid data reset before tests
2018-03-06 11:42:00 +01:00
Pietro Bonfa
ec93bfa319
Use DP defined in UtilXlib everywhere
2018-02-28 11:21:12 +01:00
Pietro Bonfa
c3f4234a22
Some tests have long file names
2018-02-27 23:02:17 +01:00
Pietro Bonfa
92ecd0bbad
Use DP for complex
2018-02-27 22:01:28 +01:00
Pietro Bonfa
cc71b45e76
Added test for mp_circular_shift_left (GPU code only)
2018-02-27 13:54:17 +01:00