quantum-espresso/UtilXlib
giannozz b087838e6c Merge branch 'mpi_int8' into 'develop'
Better mp_reduce for int*8

Closes #758

See merge request QEF/q-e!2571
2025-02-18 20:57:48 +00:00
tests Make CCE fortran work with OpenMP offload. 2023-06-07 18:36:35 -04:00
CMakeLists.txt Fix CMake C-language targets. 2024-07-21 17:02:23 -05:00
Makefile clib/ is no more 2021-08-29 18:08:27 +00:00
Makefile.test
README.md [skip-CI] CP user guide updated 2021-09-02 18:44:12 +00:00
c_mkdir.c clib/ is no more 2021-08-29 18:08:27 +00:00
clib_wrappers.f90 Move Modules/wrappers.f90 to UtilXlib/clib_wrappers.f90 2021-06-19 19:44:23 -05:00
clocks_handler.f90 fix nvtx markers in start_clock_gpu 2022-04-06 10:15:51 +02:00
copy.c clib/ is no more 2021-08-29 18:08:27 +00:00
cptimer.c clib/ is no more 2021-08-29 18:08:27 +00:00
data_buffer.f90
device_helper.f90 Clean up 2025-01-22 14:24:06 +01:00
divide.f90
error_handler.f90 Small things 2023-03-16 14:59:02 +00:00
eval_infix.c clib/ is no more 2021-08-29 18:08:27 +00:00
export_gstart_2_solvers.f90
find_free_unit.f90
fletcher32.c clib/ is no more 2021-08-29 18:08:27 +00:00
fletcher32_mod.f90
hash.f90
md5.c clib/ is no more 2021-08-29 18:08:27 +00:00
md5.h clib/ is no more 2021-08-29 18:08:27 +00:00
md5_from_file.c clib/ is no more 2021-08-29 18:08:27 +00:00
mem_counter.f90
memstat.c More obsolete stuff 2024-01-30 17:13:14 +01:00
memusage.c clib/ is no more 2021-08-29 18:08:27 +00:00
mp.f90 Add mp_sum_i8 for one single long integer. 2025-02-12 22:27:04 -05:00
mp_bands_util.f90
mp_base.f90 Better mp_reduce for int*8 2025-02-12 11:42:28 +01:00
mp_base_gpu.f90 Better mp_reduce for int*8 2025-02-12 11:42:28 +01:00
nvtx_wrapper.f90 Move Modules/wrappers.f90 to UtilXlib/clib_wrappers.f90 2021-06-19 19:44:23 -05:00
parallel_include.f90
print_mem.f90 Fix GNU warning. 2021-06-27 13:28:11 -05:00
ptrace.c clib/ is no more 2021-08-29 18:08:27 +00:00
set_mpi_comm_4_solvers.f90 revert few unwanted changes 2023-09-11 18:39:33 +02:00
thread_util.f90
util_param.f90 Adding RMM-DIIS cpu version 2021-08-12 20:08:31 +02:00

README.md

UtilXlib

This library implements various basic tasks such as timing, tracing, optimized memory accesses and an abstraction layer for the MPI subroutines.

The following pre-processor directives can be used to enable/disable some features:

  • __MPI : activates MPI support.
  • __TRACE : activates verbose output for debugging purposes.
  • __CUDA : activates CUDA Fortran based interfaces.
  • __GPU_MPI : uses CUDA-aware MPI calls instead of the standard sync-send-update method (experimental).

Usage of wrapper interfaces for MPI

This library offers a number of interfaces that abstract the MPI APIs and optionally relax the dependency on an MPI library.
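The idea of relaxing the MPI dependency can be illustrated with a small C sketch. This is not the library's code: the actual wrappers are Fortran, and the names `sketch_sum` and `sketch_nproc` are hypothetical. The sketch only assumes that the `__MPI` macro is passed on the compiler command line when MPI is wanted; without it, the wrappers compile to serial fallbacks and no MPI header is needed.

```c
#include <assert.h>
#include <stddef.h>
#if defined(__MPI)
#include <mpi.h>   /* only required when the build defines __MPI */
#endif

/* Hypothetical wrapper (not part of UtilXlib): sums the n doubles in buf
   across all processes, in place. Returns 0 on success, 1 on failure. */
int sketch_sum(double *buf, size_t n)
{
    assert(buf != NULL || n == 0);
#if defined(__MPI)
    /* Parallel build: in-place global sum over the world communicator. */
    int err = MPI_Allreduce(MPI_IN_PLACE, buf, (int)n, MPI_DOUBLE,
                            MPI_SUM, MPI_COMM_WORLD);
    return err == MPI_SUCCESS ? 0 : 1;
#else
    /* Serial build: the local values already are the global sum. */
    (void)n;
    return 0;
#endif
}

/* Hypothetical helper: number of processes, 1 in a serial build. */
int sketch_nproc(void)
{
#if defined(__MPI)
    int n = 1;
    MPI_Comm_size(MPI_COMM_WORLD, &n);
    return n;
#else
    return 1;
#endif
}
```

Calling code is identical in both builds: it calls `sketch_sum` unconditionally, and only the build flags decide whether any communication takes place.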

The mp_* interfaces in the library can only be called after the initialization performed by the subroutine mp_start and before the finalization performed by mp_end. The exceptions are the subroutines mp_count_nodes, mp_type_create_column_section and mp_type_free, which can also be called outside this window.
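A caller that wants to catch calls made outside this window could track the state itself. The C sketch below shows one way to do that. All names are hypothetical stand-ins (`sketch_start` and `sketch_end` play the roles of mp_start and mp_end), and nothing here implies that the library performs such a check.

```c
#include <stdbool.h>

/* Hypothetical state flag: true between sketch_start and sketch_end. */
static bool comm_active = false;

void sketch_start(void) { comm_active = true;  }  /* stands in for mp_start */
void sketch_end(void)   { comm_active = false; }  /* stands in for mp_end   */

/* Returns 0 if a communication call is allowed right now, -1 otherwise. */
int sketch_check_window(void)
{
    return comm_active ? 0 : -1;
}

/* Example guarded call: refuses to run outside the start/end window. */
int sketch_guarded_call(void)
{
    if (sketch_check_window() != 0)
        return -1;   /* called before start or after end */
    /* ... the actual communication would happen here ... */
    return 0;
}
```

Routines that are valid outside the window, such as the three exceptions listed above, would simply skip the check.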

If CUDA Fortran support is enabled, almost all interfaces accept input data declared with the device attribute. Note however that CUDA Fortran support should be considered experimental.

CUDA specific notes

All calls to message passing interfaces are synchronous with respect to both MPI and CUDA streams. The code synchronizes the device before starting the communication, even in cases where communication could be avoided (for example in the serial version). A different behaviour may be observed when the default stream synchronization behaviour is overridden by the user (see cudaStreamCreateWithFlags).

Be careful when using CUDA-aware MPI, as some implementations are incomplete. The library does not check for CUDA-aware MPI APIs during initialization, but may report failure codes during execution. If you encounter problems after adding the __GPU_MPI flag, the MPI library may not support some CUDA-aware APIs.

Testing

Partial unit testing is available in the tests sub-directory. See the README.md file in that directory for further information.