Commit Graph

115 Commits

Author SHA1 Message Date
Samuel Ponce df83585b3a Change in post-processing superconductivity
The distance with respect to the Fermi level was not taken into
account when writing on files the superconducting gaps on the Fermi surface.

Issue raised by Miao Gao and solved by R. Margine and S. Ponce.
2019-04-02 23:53:48 +01:00
Paolo Giannozzi 6834a502ef [Skip-CI] Obsolete version 'svn' replaced by 'git'; various .PHONY of questionable
usefulness, referring to no longer existing procedure devised for svn, removed
2019-03-01 17:42:56 +01:00
Pietro Delugas 3b20b0ff9f modified xml printout for vdw element
* fixed reporting for xdm cases
* added Grimme-d3 case
2019-02-12 18:07:17 +00:00
Pietro dfd95eb378 Remove or reduce MPI message splitting in CUDA Fortran implementation of UtilXlib 2019-02-11 09:49:52 +00:00
Paolo Giannozzi 5a6a4417dd Also LAXlib and UtilXlib lib should make better usage of parallel_include 2019-02-04 10:25:07 +01:00
Pietro Bonfa c4d0af26dd Added optional argument to deallocate UtilXlib buffers and checks to avoid double allocations on mp_start. 2018-12-17 13:30:04 +01:00
Pietro Bonfa 1cfc3dad84 Forgot actual mp_bcast_c6d_gpu test. 2018-12-11 19:01:27 +01:00
Pietro Bonfa fae8e4a424 Added tests for mp_bcast_c6d_gpu and mp_sum_r6d_gpu 2018-12-11 18:56:31 +01:00
Pietro Bonfa 4961c03ff2 Merged QEF/develop and added GPU subroutines for mp_bcast_c6d and mp_sum_r6d 2018-12-11 18:55:17 +01:00
Pietro Bonfa 448c98fc6d Preprocessor directives aligned correctly. Arrays that get updated now have inout attribute. 2018-11-12 18:38:16 +01:00
Pietro Bonfa f6ea23b0a3 New cuda_util 2018-11-09 19:44:28 +01:00
Pietro Bonfa 5a267077ae Avoid useless copies also in mp_allgatherv_inplace_cplx_array_gpu and mp_gatherv_inplace_cplx_array_gpu 2018-11-09 19:36:06 +01:00
Pietro Bonfa f71cdbfcdb A few cuf kernel based helper subroutines for the CUDA port 2018-11-08 11:29:39 +01:00
Pietro Bonfa 5724f251eb Avoid some unnecessary data copies on collective communications among groups with one process 2018-11-07 10:52:20 +01:00
Iurii Timrov 52a67b19a5 1) Implementation of the PHonon+U code (A. Floris, S. de Gironcoli, E.K.U. Gross,
I. Timrov, B. Himmetoglu, N. Marzari, M. Cococcioni). The code was ported
from QE 5.0.2 to the latest version of QE, by I. Timrov with the help of
A. Floris and M. Cococcioni. Many thanks for the discussions with P. Giannozzi,
P. Delugas, A. Dal Corso, M. Calandra, L. Paulatto about various issues
during the porting. Sorry if I forgot to mention someone.
2) Some small modifications in the HP code in order to be consistent
with the porting of PHonon+U and changes in LR_Modules.
2018-10-30 16:20:32 +01:00
Pietro Bonfa 8fa8008e26 Added missing dependency 2018-10-09 11:35:23 +02:00
Pietro Bonfa 6112f56d5d Added information regarding MPI interfaces 2018-09-28 16:01:30 +02:00
Pietro Bonfa 1f250ee0f8 Added warning about synchronization behaviour 2018-09-19 14:37:51 +02:00
Pietro Bonfa 9ca6912a7d Added correct handling of CUDA streams synchronization. New behaviour detailed in source file and in README.md 2018-09-18 18:34:12 +02:00
Pietro Bonfa ace7570da1 Added README file 2018-08-16 14:17:17 +02:00
Stefano de Gironcoli 86920021c8 revert thread_util to version w/o data chunking 2018-08-07 20:45:56 +02:00
Pietro Bonfa 5dd5569187 Merge branch 'develop' into mpicuda 2018-08-07 16:18:31 +02:00
Stefano de Gironcoli ac8b63bd4c update of previous merge PPCG 2018-08-05 08:25:56 +00:00
Daniel Pinkal 29164dbf9a Update clocks_handler.f90
Fixed broken clock output for WALL times > 1h and harmonized spacing of d-h-m-s time output.
For the WALL time output in the case of mhour>0, "nmin" was wrongly given as the first variable of the output string.
Additionally, the spacing of all d-h-m-s outputs was changed so that the words "CPU" and "WALL" are all aligned with the output in seconds only [if defined(__CLOCK_SECONDS)].
2018-07-24 13:49:02 +00:00
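The d-h-m-s breakdown described in the commit above can be illustrated with a minimal sketch. This is not the code from clocks_handler.f90: the function names `split_dhms` and `format_time` and the exact field widths are assumptions made for illustration only. It shows how a duration in seconds is split into days, hours, minutes and seconds, and why the hours value must be the first argument once the duration exceeds one hour.

```python
def split_dhms(seconds):
    """Split a non-negative duration in seconds into (days, hours, minutes, seconds)."""
    days, rem = divmod(seconds, 86400)
    hours, rem = divmod(rem, 3600)
    minutes, secs = divmod(rem, 60)
    return int(days), int(hours), int(minutes), secs


def format_time(label, seconds):
    """Format a duration with a layout that depends on its magnitude (hypothetical widths)."""
    days, hours, minutes, secs = split_dhms(seconds)
    if days > 0:
        return f"{days:2d}d{hours:2d}h{minutes:2d}m {label}"
    if hours > 0:
        # Hours go first, then minutes. Passing the minutes value in the
        # first slot is the kind of mix-up the commit message describes.
        return f"{hours:2d}h{minutes:2d}m {label}"
    if minutes > 0:
        return f"{minutes:2d}m{secs:5.2f}s {label}"
    return f"{secs:8.2f}s {label}"


if __name__ == "__main__":
    for t in (59.5, 3725.0, 90061.0):
        print(format_time("WALL", t))
```

Keeping the split in one helper and the layout choice in another means the CPU and WALL outputs share the same branches, so a misplaced variable in one of them is easier to spot.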
Pietro Bonfa 119fd5a924 Added unit tests for new gather functionalities 2018-07-05 17:26:42 +02:00
Pietro Bonfa 5fb0f3c45c Added GPU version of mp_type_create_cplx_column_section and corrected interface for mp_gather 2018-07-05 15:15:35 +02:00
paoloumari f1409c1daf Updated using makedep.sh 2018-06-30 10:18:43 +02:00
Paolo Giannozzi 7a426ad216 Release-notes updated, obsolete script for release deleted.
Slightly improved printout of wall times (format depends upon wall time,
not cpu time as before: no difference except for some cases with OpenMP)
2018-06-20 18:37:23 +02:00
Stefano de Gironcoli 7b84c0114e undo the rescaling of the cpu time per thread.
it can be restored by uncommenting one line in UtilXlib/clocks_handler.f90
search for PRINT_AVG_CPU_TIME_PER_THREAD in that file
2018-06-20 09:32:07 +02:00
Stefano de Gironcoli 8b3eb1752d cpu time rescaling is made optional via a pre-processing option 2018-06-20 09:32:07 +02:00
Stefano De Gironcoli 87400ecfd3 cpu time per mpi task scaled down by the number of omp processes participating in it 2018-06-20 09:32:07 +02:00
Pietro Bonfa b138880ebc Test mp_max_rv_buffer today showed two typos in parallel_(max|min)_real_gpu 2018-06-18 11:07:49 +02:00
Pietro Bonfa fa3653b6e3 Test was not doing what it's supposed to do. 2018-06-18 11:05:29 +02:00
Pietro Bonfa 83e08840e7 Merged changes in develop adding new utilxlib mpi interfaces. Cuda equivalent functions added. Tests for new subroutines will be added in a separate commit. 2018-06-18 11:03:25 +02:00
Ye Luo 6ac7f8c32a Merge branch 'bugfix-ndiag' into opt-threading-all-parts 2018-06-14 19:05:31 -05:00
Paolo Giannozzi 2c4f0dbdac More realistic memory estimate for EXX calculations.
Minor improvements to memory counter.
2018-06-08 08:27:01 +02:00
Ye Luo 8812c4085f Reverted to the old algorithm in hpsi_dot_v. 2018-06-02 16:24:36 -05:00
Ye Luo fa21b8d52a Add functions to do threaded memcpy and memset
threaded_memXXX contains a parallel do region
threaded_barrier_memXXX contains do region without parallel
threaded_nowait_memXXX contains do region without parallel and a nowait at the end do
2018-06-02 12:22:42 -05:00
Ye Luo c2b2efdaed Use more sensible names. 2018-05-31 12:29:41 -05:00
Ye Luo 14508b0810 Optimize hpsi_dot_v 2018-05-28 19:13:39 -05:00
Ye Luo af2fac5ef9 Replace allgather with gather. 2018-05-28 08:34:42 -05:00
Ye Luo 2c6c859896 Remove all unnecessary mem ops in cegterg. 2018-05-27 21:54:46 -05:00
Pietro Bonfa 47bf66c777 Random seed was not set when using /dev/urandom 2018-03-16 16:38:15 +01:00
Pietro Bonfa c376f8e240 More tests for mp_bcast and mp_sum 2018-03-06 19:09:52 +01:00
Pietro Bonfa 59a0749fc8 Better fix for issue #16. 2018-03-06 18:03:46 +01:00
Pietro Bonfa 971d9de467 Real random numbers used now, paranoid data reset before tests 2018-03-06 11:42:00 +01:00
Pietro Bonfa ec93bfa319 Use DP defined in utilXlib everywhere 2018-02-28 11:21:12 +01:00
Pietro Bonfa c3f4234a22 Some tests have long file names 2018-02-27 23:02:17 +01:00
Pietro Bonfa 92ecd0bbad Use DP for complex 2018-02-27 22:01:28 +01:00
Pietro Bonfa cc71b45e76 Added test for mp_circular_shift_left (GPU code only) 2018-02-27 13:54:17 +01:00