Commit Graph

313 Commits

Author SHA1 Message Date
Oscar Baseggio (obaseggi) e7a95188b5 fix device_helper
clean lr_apply_liouvillian
2022-06-29 12:06:48 +02:00
Laura Bellentani 3f92ea72ba loops over bands in cgsolve offloaded to gpus with cublasddot_v2 2022-06-29 10:07:01 +02:00
Laura Bellentani 95ee7dafbc loop3, loop2, loop4 ported 2022-06-29 10:06:45 +02:00
Laura Bellentani 30f1098d58 'ddot under development' 2022-06-29 10:06:42 +02:00
Laura Bellentani f488b24562 cgsolve partially ported, ddot result on device still missing (wip) 2022-06-29 10:06:15 +02:00
Ivan Carnimeo 0c65a0bb46 some changes in myddot_vector_gpu 2022-06-21 15:12:14 +02:00
Ivan Carnimeo 7e3224e180 ParO to OpenACC: fix cmake compilation 2022-06-01 17:57:29 +02:00
Ivan Carnimeo 8628c5cd17 bpcg_k from CUF to OpenACC
- bpcg_k_gpu removed
2022-05-30 19:31:52 +02:00
Ivan Carnimeo 0546abf585 Fix cmake compilation 2022-05-06 15:11:58 +02:00
Ivan Carnimeo 31d427bb87 This might be a compact way to perform mp_sum
passing non-contiguous device arrays
2022-05-05 15:38:57 +02:00
Ivan Carnimeo ef9f59d4d0 cegterg to OpenACC (12):
- small fixes for CPU version (this fixes the WARNING at commit (11))
2022-04-28 21:45:21 +02:00
Ivan Carnimeo 9d9b0b9acb cegterg to OpenACC (11):
- cegterg_gpu merged with cegterg
	- cegterg_gpu bypassed, only cegterg is used for both CPU and GPU
WARNING: this commit has been tested only on GPU and needs to be tested also on CPU
2022-04-28 21:16:48 +02:00
Ivan Carnimeo d684c50deb cegterg to OpenACC (1):
psi_d, hpsi_d, spsi_d --> psi, hpsi, spsi
	MYDDOT (host) --> MYDDOT_VECTOR_GPU (device)
2022-04-22 16:24:59 +02:00
Sergio Orlandini cb77cb885a fix nvtx markers in start_clock_gpu 2022-04-06 10:15:51 +02:00
Paolo Giannozzi 93c29fab7f CUDA MPI problem
This does not solve issue 478 but it seems to me clearly wrong
2022-04-01 08:57:27 +02:00
Edan Bainglass 068d2d5b93 Added missing `TRIM` to `calling_routine` and `message` in `errore` 2022-03-04 09:48:56 -06:00
Riccardo Bertossa 94cc1f61d6 [CPV] bulk of the CG code ported to gpu.
NOT WORKING: restart from cg to verlet
  NOT ACCELERATED: ultrasoft, ensemble dft

added useful debug/test module Modules/debug.f90
it works as follows: you add a call to the subroutine
CHECKPOINT(your_array/scalar_to_be_checked)

then, if the flag debug_checkpoints in the control namelist of the input file is .true., the routine is active.
If the flag debug_checkpoint_testing == .false., the data is written to the filesystem, each time in a new file named after
debug_checkpoint_file_prefix (an input variable in the control namelist that defaults to 'debug_checkpoint'). Otherwise the data is read
from the filesystem and some useful information is printed to stdout, such as the difference and the magnitude of the vectors.
This debug routine WORKS ONLY with a SINGLE MPI process
2021-11-20 20:46:25 +01:00
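
The record-then-compare workflow described in the commit message above can be illustrated with a minimal Python sketch. This is not the Fortran code in Modules/debug.f90: the class name `Checkpointer`, the per-call file numbering, and the JSON file format are assumptions made for illustration. Only the two flags and the default prefix 'debug_checkpoint' come from the message.

```python
import json
import math
import os
import tempfile


class Checkpointer:
    """Hypothetical stand-in for the CHECKPOINT routine (single process only)."""

    def __init__(self, enabled, testing, prefix="debug_checkpoint"):
        self.enabled = enabled    # plays the role of debug_checkpoints
        self.testing = testing    # plays the role of debug_checkpoint_testing
        self.prefix = prefix      # plays the role of debug_checkpoint_file_prefix
        self.counter = 0          # assumption: one numbered file per call

    def checkpoint(self, values):
        if not self.enabled:
            return None
        # Accept either a scalar or a sequence of numbers.
        items = values if hasattr(values, "__iter__") else [values]
        data = [float(v) for v in items]
        path = f"{self.prefix}_{self.counter:06d}.json"
        self.counter += 1
        if not self.testing:
            # Record mode: write this call's data to a new file.
            with open(path, "w") as fh:
                json.dump(data, fh)
            return None
        # Testing mode: read the reference back and report the deviation.
        with open(path) as fh:
            ref = json.load(fh)
        if len(ref) != len(data):
            raise ValueError(f"{path}: size mismatch with reference")
        diff = math.sqrt(sum((a - b) ** 2 for a, b in zip(data, ref)))
        norm = math.sqrt(sum(b * b for b in ref))
        print(f"{path}: |diff| = {diff:.3e}, |ref| = {norm:.3e}")
        return diff, norm


if __name__ == "__main__":
    with tempfile.TemporaryDirectory() as tmp:
        prefix = os.path.join(tmp, "debug_checkpoint")
        Checkpointer(True, False, prefix).checkpoint([1.0, 2.0, 2.0])  # reference run
        Checkpointer(True, True, prefix).checkpoint([1.0, 2.0, 2.5])   # run under test
```

Because files are matched purely by call order, both runs must execute the same sequence of checkpoint calls, which is consistent with the single-process restriction stated in the message.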
Pietro Delugas 33f0749d7e gram openACCelerated 2021-11-20 20:46:25 +01:00
Paolo Giannozzi 688f87db26 Another obsolete gfortran hack 2021-10-29 21:28:37 +02:00
Paolo Giannozzi 5d74f827c4 Remove make.depend files from git 2021-10-24 21:29:29 +02:00
Paolo Giannozzi 1c1cf38215 Fix compilation with __MPI_MODULE 2021-10-23 15:12:38 +02:00
giannozz 6b9d6677b5 [skip-CI] CP user guide updated 2021-09-02 18:44:12 +00:00
giannozz fdd2d391b8 clib/ is no more 2021-08-29 18:08:27 +00:00
Elena De Paoli 57760bec90 Adding RMM-DIIS cpu version 2021-08-12 20:08:31 +02:00
Ye Luo 3e0102c194 -cuda is needed for NVTX lib. 2021-07-08 00:53:05 -05:00
Ye Luo 06e5783b72 Fix GNU warning. 2021-06-27 13:28:11 -05:00
Ye Luo 3799423cd1 Add qe_device_lapack target for device helpers. 2021-06-19 21:29:42 -05:00
Ye Luo a4c6bfaa99 Move Modules/wrappers.f90 to UtilXlib/clib_wrappers.f90 2021-06-19 19:44:23 -05:00
Ye Luo 69fbdcb0b1 Add unit test runner. 2021-05-23 09:58:06 -05:00
Ye Luo 01a13df4b2 Relocate unit tests. 2021-05-02 14:43:11 -05:00
Daniele Cesarini 042a8ca1a0 CMake fixes for shared lib builds and added a CI build for NVHPC 2021-04-06 19:07:49 +00:00
Andrea Ferretti 9a69f149b9 more files related to PP moved from PW/src to upflib 2021-02-15 10:09:20 +01:00
giannozz 7840f39805 Merge branch 'fix_trace' into 'develop'
fix trace initialization by separating the clock initialization from the setting of the max_depth of the trace

See merge request QEF/q-e!1301
2021-01-29 16:02:24 +00:00
giannozz 9bcb7ce91f Merge branch 'add-ctest-run-without-check' into 'develop'
[CMake]  keep both dependency logic in ctest

Closes #270

See merge request QEF/q-e!1299
2021-01-29 13:57:00 +00:00
Pietro Delugas 4615a9d00c adding independent routine for setting max_print_depth value 2021-01-29 09:30:36 +01:00
Ye Luo f24b4d583c Fix UtilXlib tests GPU build. 2021-01-28 19:26:29 -06:00
Pietro Delugas e36cb4cfa5 completing MR with changes suggested by reviewers 2021-01-27 00:19:34 +01:00
Pietro Delugas 747bbf0d76 adding the calls to nvidia profiler plugin in the main code
the plugin is compiled only when the __NV_PROFILE preprocessor flag is defined.
2021-01-25 15:35:37 +01:00
Ye Luo 7b94aa9b36 Merge branch 'fftxlib-error' into 'develop'
Fix error handling in FFTXlib

See merge request QEF/q-e!1291
2021-01-24 16:58:10 +00:00
Ye Luo 4d8d452630 Typo correction. 2021-01-24 10:24:27 -06:00
Paolo Giannozzi 229be57edb Script for dependencies and make.depend updated 2021-01-23 22:32:58 +01:00
Daniele Cesarini d462b73a6a Fixed cmake files for cpu-gpu merge 2021-01-23 12:42:52 +01:00
Pietro Delugas db0da8b0d9 Merge branch 'merge_qegpu' into HEAD 2021-01-22 17:20:50 +01:00
Pietro Bonfa 94b889c198 cuda_util is now part of devxlib 2021-01-22 10:59:28 +01:00
Pietro Bonfa 0c285826d2 Merge branch 'develop' into syncqe8 2021-01-21 19:27:49 +01:00
Daniele Cesarini 586f66aadf Introduce CUDA support in CMake with some refactoring. 2021-01-18 14:50:50 +00:00
Pietro Bonfa af81968cf1 Merge branch 'develop' into syncqe6 2020-12-05 19:25:50 +01:00
Pietro Bonfa 1139c04387 New utilXlib interface for ParO (courtesy of Ivan Carnimeo) 2020-12-03 11:17:03 +01:00
Pietro Bonfa 9529540d8d Merge branch 'develop' into syncqe5 2020-11-24 13:44:01 +01:00
Daniele Cesarini b2a4a6b89f Fixed shared library compilation 2020-11-23 12:16:32 +01:00
Pietro Bonfa 2da1f5ccf3 Forgot one file 2020-10-22 15:30:54 +02:00
Pietro Bonfa a6401f2097 Removed old copy/pasted version of devxlib from utilxlib 2020-10-22 15:28:40 +02:00
Pietro Bonfa 05c866e91f Merge branch 'develop' into syncqe2 2020-10-18 17:50:41 +02:00
Paolo Giannozzi 635ef1c9bf [skip-CI] executables should be executable 2020-10-15 21:17:38 +02:00
Pietro Bonfa 1a4df64ffe Merge branch 'develop' into syncqe 2020-10-04 16:33:16 +02:00
giannozz d0d0d8d721 Merge branch 'remove-MPI-module' into 'develop'
Remove use of MPI module

See merge request QEF/q-e!1127
2020-10-01 19:24:05 +00:00
Ye Luo 5ca1cfd343 Remove use of MPI module. 2020-09-30 23:22:37 -05:00
Ye Luo ccb30bc8db Fix unit tests. 2020-09-30 23:14:23 -05:00
Ye Luo 8d7b692508 Move executables and static archives to bin lib 2020-09-30 13:42:46 -05:00
Daniele Cesarini e736e1c01c Fixed missing dependencies to OpenMP 2020-09-29 18:11:32 +02:00
Daniele Cesarini fc09ef40e4 Removed cmake function preprocessing and replaced with _qe_add_global_target 2020-09-29 18:11:32 +02:00
Daniele Cesarini 90840d6caf Fix preprocessor flags for Fortran files 2020-09-29 18:11:32 +02:00
Daniele Cesarini 9246f191ac Restricted dependency visibility for cmake targets 2020-09-29 18:11:31 +02:00
Daniele Cesarini d912e3905c Added missing QE packages to cmake 2020-09-29 18:11:31 +02:00
Federico Ficarelli 2adf2e3f44 Make qe_install_targets variadic 2020-09-29 18:11:30 +02:00
Federico Ficarelli 1a1bb304a5 Add missing sources to QE::UTILX 2020-09-29 18:11:30 +02:00
Federico Ficarelli 7598a46ff2 Add FFTXLib as QE::FFTX 2020-09-29 18:11:30 +02:00
Federico Ficarelli c713248d91 Make qe_add_library work with interface targets 2020-09-29 18:11:30 +02:00
Federico Ficarelli b4eb2fd490 Remove useless features target 2020-09-29 18:11:30 +02:00
Federico Ficarelli ba61edaebb Fix component names case 2020-09-29 18:11:30 +02:00
Federico Ficarelli 4593deec27 Add support for targets installation 2020-09-29 18:11:30 +02:00
Federico Ficarelli 9f58ebece4 Add CMake support for LAXLib 2020-09-29 18:11:29 +02:00
Federico Ficarelli ce7c15c3b0 Make qe_install_targets variadic 2020-09-29 18:11:29 +02:00
Federico Ficarelli 6f5e0fb95a Add missing sources to QE::UTILX 2020-09-29 18:11:29 +02:00
Federico Ficarelli 256bf99987 Add FFTXLib as QE::FFTX 2020-09-29 18:11:29 +02:00
Federico Ficarelli 241ad122e0 Make qe_add_library work with interface targets 2020-09-29 18:11:28 +02:00
Federico Ficarelli bf4c480389 Remove useless features target 2020-09-29 18:11:28 +02:00
Federico Ficarelli f01e1e21b0 Fix component names case 2020-09-29 18:11:28 +02:00
Federico Ficarelli 282558e285 Add support for targets installation 2020-09-29 18:11:28 +02:00
Federico Ficarelli e0673accd6 Add CMake support for LAXLib 2020-09-29 18:11:28 +02:00
Pietro Bonfa c65f32e8e7 Updated devXlib 2020-07-30 16:24:40 +02:00
Stefano de Gironcoli f4096c8dfd in a non-blocking send it is not safe to assume that when the send
involves multiple chunks these are sent in order... that is, checking
the departure of the 'last' chunk only is not safe...
temporarily, a hopefully capacious array for storing all active send_requests
is allocated; if it is insufficient the code stops and complains.
a cleaner solution should be found.
2020-06-08 16:46:12 +02:00
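
The idea in the commit message above (keep a handle for every chunk of a non-blocking send in a fixed-size array, wait on all of them rather than only the last, and stop if the array is too small) can be sketched in plain Python. This is an illustration only, not the mp.f90 implementation: `SendRequestPool` is a hypothetical name, and `threading.Event` objects stand in for MPI request handles.

```python
import threading


class SendRequestPool:
    """Fixed-capacity store for the handles of all active chunk sends."""

    def __init__(self, capacity):
        self.capacity = capacity
        self.active = []

    def add(self, request):
        # Mirrors "if it is insufficient the code stops and complains".
        if len(self.active) >= self.capacity:
            raise RuntimeError("too many active send requests")
        self.active.append(request)

    def wait_all(self):
        # Wait for every chunk; completion order is not assumed.
        for request in self.active:
            request.wait()
        completed = len(self.active)
        self.active.clear()
        return completed


if __name__ == "__main__":
    pool = SendRequestPool(capacity=4)
    chunks = [threading.Event() for _ in range(3)]
    for chunk in chunks:
        pool.add(chunk)
    # Simulate out-of-order completion: the last chunk finishes first.
    for chunk in reversed(chunks):
        chunk.set()
    print(pool.wait_all(), "chunks completed")
```

Waiting only on the final handle would return as soon as that one chunk had left, even if earlier chunks were still in flight, which is the hazard the message describes.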
Stefano de Gironcoli 45ab1cb566 add the possibility of a non-blocking send (mpi_isend) if a send_request
handle is added as optional argument (only for real and complex data types
but should be extendable if needed)
2020-06-08 16:46:12 +02:00
Stefano de Gironcoli 85aaff8f4a point-to-point operations mp_send and mp_recv added to mp.f90
NB1) only cpu version has been implemented;
NB2) no barrier in this call even when __USE_BARRIER, if needed add them outside.
2020-06-08 16:46:11 +02:00
Stefano de Gironcoli cc713f4331 removal of unused variable 2020-06-08 16:46:11 +02:00
Pietro Bonfa db871e6906 Merge branch 'cp-gpu-milestone1' into gpu-develop 2020-05-06 13:00:41 +02:00
Pietro Bonfa c4c2417226 Merge branch 'develop' into gpu-develop 2020-04-26 16:43:48 +02:00
Hyungjun Lee 5c120fb820 update mp_base.f90 2020-04-15 19:46:24 -05:00
carcava 73b7a967fa Merge branch 'gpu-develop' into cp-gpu-milestone1
Conflicts:
	CPV/src/init.f90
	LAXlib/la_helper.f90
2020-04-16 00:28:10 +02:00
Pietro Bonfa 56c3090769 Merge commit '3c87bac5e67b5e30b9c5d7e7d3a69f9fb4285e1b' into gpu-develop 2020-04-11 19:05:36 +02:00
Pietro Bonfa 91dc1c504a Merge commit '747995108ec2c873a483a3534d11ac44901ebdd0' into gpu-develop 2020-04-05 11:22:45 +02:00
Pietro 968d4df406 GPU timers 2020-03-21 17:41:10 +00:00
Pietro Bonfa fd0905cdb7 Replaced custom calls with devxlib 2020-03-14 13:22:31 +01:00
carcava 38e4fc31b5 Merge remote-tracking branch 'origin' into cp-gpu 2020-03-13 17:23:53 +01:00
carcava e25d65a139 - fix for CPU only compilation 2020-03-11 09:43:06 +01:00
Pietro Bonfa 3fd21fdad3 Always check if clock is running [skip ci] 2020-03-10 22:05:39 +01:00
carcava 657a7d2e12 more device variables, more helper subs, bug fix for OpenMP with intel compiler 2020-03-10 01:55:17 +01:00
carcava 8fe16a7bfe adding device_helper.f90 to collect miscellaneous helper subroutines 2020-03-09 18:15:34 +01:00
Pietro Bonfa 74b0ff4e77 Better and more accurate timing 2020-03-07 15:24:46 +01:00
carcava 3b8224629f Merge branch 'gpu-develop' into cp-gpu 2020-03-06 19:24:55 +01:00