Oscar Baseggio (obaseggi)
e7a95188b5
fix device_helper
...
clean lr_apply_liouvillian
2022-06-29 12:06:48 +02:00
Laura Bellentani
3f92ea72ba
loops over bands in cgsolve offloaded to gpus with cublasddot_v2
2022-06-29 10:07:01 +02:00
Laura Bellentani
95ee7dafbc
loop3, loop2, loop4 ported
2022-06-29 10:06:45 +02:00
Laura Bellentani
30f1098d58
'ddot under development'
2022-06-29 10:06:42 +02:00
Laura Bellentani
f488b24562
cgsolve partially ported, ddot result on device still missing (wip)
2022-06-29 10:06:15 +02:00
Ivan Carnimeo
0c65a0bb46
some changes in myddot_vector_gpu
2022-06-21 15:12:14 +02:00
Ivan Carnimeo
7e3224e180
ParO to OpenACC: fix cmake compilation
2022-06-01 17:57:29 +02:00
Ivan Carnimeo
8628c5cd17
bpcg_k from CUF to OpenACC
...
- bpcg_k_gpu removed
2022-05-30 19:31:52 +02:00
Ivan Carnimeo
0546abf585
Fix cmake compilation
2022-05-06 15:11:58 +02:00
Ivan Carnimeo
31d427bb87
This might be a compact way to perform mp_sum
...
passing non-contiguous device arrays
2022-05-05 15:38:57 +02:00
Ivan Carnimeo
ef9f59d4d0
cegterg to OpenACC (12):
...
- small fixes for CPU version (this fixes the WARNING at commit (11))
2022-04-28 21:45:21 +02:00
Ivan Carnimeo
9d9b0b9acb
cegterg to OpenACC (11):
...
- cegterg_gpu merged with cegterg
- cegterg_gpu bypassed, only cegterg is used for both CPU and GPU
WARNING: this commit has been tested only on GPU and needs to be tested also on CPU
2022-04-28 21:16:48 +02:00
Ivan Carnimeo
d684c50deb
cegterg to OpenACC (1):
...
psi_d, hpsi_d, spsi_d --> psi, hpsi, spsi
MYDDOT (host) --> MYDDOT_VECTOR_GPU (device)
2022-04-22 16:24:59 +02:00
Sergio Orlandini
cb77cb885a
fix nvtx markers in start_clock_gpu
2022-04-06 10:15:51 +02:00
Paolo Giannozzi
93c29fab7f
CUDA MPI problem
...
This does not solve issue 478 but it seems to me clearly wrong
2022-04-01 08:57:27 +02:00
Edan Bainglass
068d2d5b93
Added missing `TRIM` to `calling_routine` and `message` in `errore`
2022-03-04 09:48:56 -06:00
Riccardo Bertossa
94cc1f61d6
[CPV] bulk of the CG code ported to gpu.
...
NOT WORKING: restart from cg to verlet
NOT ACCELERATED: ultrasoft, ensemble dft
added a useful debug/test module, Modules/debug.f90
It works as follows: you put a call to the subroutine
CHECKPOINT(your_array/scalar_to_be_checked)
Then, if the flag debug_checkpoints is .true. in the control namelist of the input file, the routine does something.
If the flag debug_checkpoint_testing == .false., the data is written to the filesystem, each time in a new file named using
debug_checkpoint_file_prefix (an input variable in the control namelist that defaults to 'debug_checkpoint'). Otherwise the data is read
from the filesystem and some useful information is printed to stdout, such as the difference and the magnitude of the vectors.
This debug routine WORKS ONLY with a SINGLE MPI process
2021-11-20 20:46:25 +01:00
Pietro Delugas
33f0749d7e
gram openACCelerated
2021-11-20 20:46:25 +01:00
Paolo Giannozzi
688f87db26
Another obsolete gfortran hack
2021-10-29 21:28:37 +02:00
Paolo Giannozzi
5d74f827c4
Remove make.depend files from git
2021-10-24 21:29:29 +02:00
Paolo Giannozzi
1c1cf38215
Fix compilation with __MPI_MODULE
2021-10-23 15:12:38 +02:00
giannozz
6b9d6677b5
[skip-CI] CP user guide updated
2021-09-02 18:44:12 +00:00
giannozz
fdd2d391b8
clib/ is no more
2021-08-29 18:08:27 +00:00
Elena De Paoli
57760bec90
Adding RMM-DIIS cpu version
2021-08-12 20:08:31 +02:00
Ye Luo
3e0102c194
-cuda is needed for NVTX lib.
2021-07-08 00:53:05 -05:00
Ye Luo
06e5783b72
Fix GNU warning.
2021-06-27 13:28:11 -05:00
Ye Luo
3799423cd1
Add qe_device_lapack target for device helpers.
2021-06-19 21:29:42 -05:00
Ye Luo
a4c6bfaa99
Move Modules/wrappers.f90 to UtilXlib/clib_wrappers.f90
2021-06-19 19:44:23 -05:00
Ye Luo
69fbdcb0b1
Add unit test runner.
2021-05-23 09:58:06 -05:00
Ye Luo
01a13df4b2
Relocate unit tests.
2021-05-02 14:43:11 -05:00
Daniele Cesarini
042a8ca1a0
CMake fixes for shared lib builds and added a CI build for NVHPC
2021-04-06 19:07:49 +00:00
Andrea Ferretti
9a69f149b9
more files related to PP moved from PW/src to upflib
2021-02-15 10:09:20 +01:00
giannozz
7840f39805
Merge branch 'fix_trace' into 'develop'
...
fix trace initialization by separating the clock initialization from the setting of the max_depth of the trace
See merge request QEF/q-e!1301
2021-01-29 16:02:24 +00:00
giannozz
9bcb7ce91f
Merge branch 'add-ctest-run-without-check' into 'develop'
...
[CMake] keep both dependency logic in ctest
Closes #270
See merge request QEF/q-e!1299
2021-01-29 13:57:00 +00:00
Pietro Delugas
4615a9d00c
adding independent routine for setting max_print_depth value
2021-01-29 09:30:36 +01:00
Ye Luo
f24b4d583c
Fix UtilXlib tests GPU build.
2021-01-28 19:26:29 -06:00
Pietro Delugas
e36cb4cfa5
completing MR with changes suggested by reviewers
2021-01-27 00:19:34 +01:00
Pietro Delugas
747bbf0d76
adding the calls to nvidia profiler plugin in the main code
...
the plugin is compiled only when the __NV_PROFILE preprocessor flag is defined.
2021-01-25 15:35:37 +01:00
Ye Luo
7b94aa9b36
Merge branch 'fftxlib-error' into 'develop'
...
Fix error handling in FFTXlib
See merge request QEF/q-e!1291
2021-01-24 16:58:10 +00:00
Ye Luo
4d8d452630
Typo correction.
2021-01-24 10:24:27 -06:00
Paolo Giannozzi
229be57edb
Script for dependencies and make.depend updated
2021-01-23 22:32:58 +01:00
Daniele Cesarini
d462b73a6a
Fixed cmake files for cpu-gpu merge
2021-01-23 12:42:52 +01:00
Pietro Delugas
db0da8b0d9
Merge branch 'merge_qegpu' into HEAD
2021-01-22 17:20:50 +01:00
Pietro Bonfa
94b889c198
cuda_util is now part of devxlib
2021-01-22 10:59:28 +01:00
Pietro Bonfa
0c285826d2
Merge branch 'develop' into syncqe8
2021-01-21 19:27:49 +01:00
Daniele Cesarini
586f66aadf
Introduce CUDA support in CMake with some refactoring.
2021-01-18 14:50:50 +00:00
Pietro Bonfa
af81968cf1
Merge branch 'develop' into syncqe6
2020-12-05 19:25:50 +01:00
Pietro Bonfa
1139c04387
New utilXlib interface for ParO (courtesy of Ivan Carnimeo)
2020-12-03 11:17:03 +01:00
Pietro Bonfa
9529540d8d
Merge branch 'develop' into syncqe5
2020-11-24 13:44:01 +01:00
Daniele Cesarini
b2a4a6b89f
Fixed shared library compilation
2020-11-23 12:16:32 +01:00
Pietro Bonfa
2da1f5ccf3
Forgot one file
2020-10-22 15:30:54 +02:00
Pietro Bonfa
a6401f2097
Removed old copy/pasted version of devxlib from utilxlib
2020-10-22 15:28:40 +02:00
Pietro Bonfa
05c866e91f
Merge branch 'develop' into syncqe2
2020-10-18 17:50:41 +02:00
Paolo Giannozzi
635ef1c9bf
[skip-CI] executables should be executable
2020-10-15 21:17:38 +02:00
Pietro Bonfa
1a4df64ffe
Merge branch 'develop' into syncqe
2020-10-04 16:33:16 +02:00
giannozz
d0d0d8d721
Merge branch 'remove-MPI-module' into 'develop'
...
Remove use of MPI module
See merge request QEF/q-e!1127
2020-10-01 19:24:05 +00:00
Ye Luo
5ca1cfd343
Remove use of MPI module.
2020-09-30 23:22:37 -05:00
Ye Luo
ccb30bc8db
Fix unit tests.
2020-09-30 23:14:23 -05:00
Ye Luo
8d7b692508
Move executables and static archives to bin lib
2020-09-30 13:42:46 -05:00
Daniele Cesarini
e736e1c01c
Fixed missing dependencies to OpenMP
2020-09-29 18:11:32 +02:00
Daniele Cesarini
fc09ef40e4
Removed cmake function preprocessing and replaced with _qe_add_global_target
2020-09-29 18:11:32 +02:00
Daniele Cesarini
90840d6caf
Fix preprocessor flags for Fortran files
2020-09-29 18:11:32 +02:00
Daniele Cesarini
9246f191ac
Restricted dependency visibility for cmake targets
2020-09-29 18:11:31 +02:00
Daniele Cesarini
d912e3905c
Added missing QE packages to cmake
2020-09-29 18:11:31 +02:00
Federico Ficarelli
2adf2e3f44
Make qe_install_targets variadic
2020-09-29 18:11:30 +02:00
Federico Ficarelli
1a1bb304a5
Add missing sources to QE::UTILX
2020-09-29 18:11:30 +02:00
Federico Ficarelli
7598a46ff2
Add FFTXLib as QE::FFTX
2020-09-29 18:11:30 +02:00
Federico Ficarelli
c713248d91
Make qe_add_library work with interface targets
2020-09-29 18:11:30 +02:00
Federico Ficarelli
b4eb2fd490
Remove useless features target
2020-09-29 18:11:30 +02:00
Federico Ficarelli
ba61edaebb
Fix component names case
2020-09-29 18:11:30 +02:00
Federico Ficarelli
4593deec27
Add support for targets installation
2020-09-29 18:11:30 +02:00
Federico Ficarelli
9f58ebece4
Add CMake support for LAXLib
2020-09-29 18:11:29 +02:00
Federico Ficarelli
ce7c15c3b0
Make qe_install_targets variadic
2020-09-29 18:11:29 +02:00
Federico Ficarelli
6f5e0fb95a
Add missing sources to QE::UTILX
2020-09-29 18:11:29 +02:00
Federico Ficarelli
256bf99987
Add FFTXLib as QE::FFTX
2020-09-29 18:11:29 +02:00
Federico Ficarelli
241ad122e0
Make qe_add_library work with interface targets
2020-09-29 18:11:28 +02:00
Federico Ficarelli
bf4c480389
Remove useless features target
2020-09-29 18:11:28 +02:00
Federico Ficarelli
f01e1e21b0
Fix component names case
2020-09-29 18:11:28 +02:00
Federico Ficarelli
282558e285
Add support for targets installation
2020-09-29 18:11:28 +02:00
Federico Ficarelli
e0673accd6
Add CMake support for LAXLib
2020-09-29 18:11:28 +02:00
Pietro Bonfa
c65f32e8e7
Updated devXlib
2020-07-30 16:24:40 +02:00
Stefano de Gironcoli
f4096c8dfd
in a non blocking send it is not safe to assume that when the send
...
involves multiple chunks these are sent in order... that is, checking
the departure of the 'last' chunk only is not safe...
temporarily, a hopefully capacious array for storing all active send_requests
is allocated; if it is insufficient the code stops and complains.
a cleaner solution should be found.
2020-06-08 16:46:12 +02:00
Stefano de Gironcoli
45ab1cb566
add the possibility of a non-blocking send (mpi_isend) if a send_request
...
handle is added as optional argument (only for real and complex data types
but should be extendable if needed)
2020-06-08 16:46:12 +02:00
Stefano de Gironcoli
85aaff8f4a
point-to-point operations mp_send and mp_recv added to mp.f90
...
NB1) only cpu version has been implemented;
NB2) no barrier in this call even when __USE_BARRIER, if needed add them outside.
2020-06-08 16:46:11 +02:00
Stefano de Gironcoli
cc713f4331
removal of unused variable
2020-06-08 16:46:11 +02:00
Pietro Bonfa
db871e6906
Merge branch 'cp-gpu-milestone1' into gpu-develop
2020-05-06 13:00:41 +02:00
Pietro Bonfa
c4c2417226
Merge branch 'develop' into gpu-develop
2020-04-26 16:43:48 +02:00
Hyungjun Lee
5c120fb820
update mp_base.f90
2020-04-15 19:46:24 -05:00
carcava
73b7a967fa
Merge branch 'gpu-develop' into cp-gpu-milestone1
...
Conflicts:
CPV/src/init.f90
LAXlib/la_helper.f90
2020-04-16 00:28:10 +02:00
Pietro Bonfa
56c3090769
Merge commit '3c87bac5e67b5e30b9c5d7e7d3a69f9fb4285e1b' into gpu-develop
2020-04-11 19:05:36 +02:00
Pietro Bonfa
91dc1c504a
Merge commit '747995108ec2c873a483a3534d11ac44901ebdd0' into gpu-develop
2020-04-05 11:22:45 +02:00
Pietro
968d4df406
GPU timers
2020-03-21 17:41:10 +00:00
Pietro Bonfa
fd0905cdb7
Replaced custom calls with devxlib
2020-03-14 13:22:31 +01:00
carcava
38e4fc31b5
Merge remote-tracking branch 'origin' into cp-gpu
2020-03-13 17:23:53 +01:00
carcava
e25d65a139
- fix for CPU only compilation
2020-03-11 09:43:06 +01:00
Pietro Bonfa
3fd21fdad3
Always check if clock is running [skip ci]
2020-03-10 22:05:39 +01:00
carcava
657a7d2e12
more device variables, more helper subs, bug fix for OpenMP with intel compiler
2020-03-10 01:55:17 +01:00
carcava
8fe16a7bfe
adding device_helper.f90 to collect miscellaneous helper subroutines
2020-03-09 18:15:34 +01:00
Pietro Bonfa
74b0ff4e77
Better and more accurate timing
2020-03-07 15:24:46 +01:00
carcava
3b8224629f
Merge branch 'gpu-develop' into cp-gpu
2020-03-06 19:24:55 +01:00