GPU markers, such as NVTX (NVIDIA) or ROCtx (AMD), are meant for
profiling, alongside dedicated profiling tools (Nsight Systems,
Radeon profiler, Perfetto UI...).
As such, they are mostly used for development or benchmarking
and are of no use to the end user of ABINIT.
This commit changes the way GPU markers are enabled:
- input parameter "gpu_use_nvtx" is removed
- configure option "with_gpu_markers/--with-gpu-markers" is added
- CMake option "-DABINIT_ENABLE_GPU_MARKERS" is added
The way GPU markers are handled in the ABINIT code doesn't change, as
this feature was already hidden behind the HAVE_GPU_MARKERS
preprocessor define.
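
As an illustration of what "protected by HAVE_GPU_MARKERS" means, here
is a minimal C sketch of a compile-time guard around NVTX ranges. It is
not the actual ABINIT code: the macro names GPU_MARK_PUSH/GPU_MARK_POP
and the function sum_squares are hypothetical, and the NVTX3 header
path is an assumption about the installed toolkit.

```c
#include <assert.h>

/* HAVE_GPU_MARKERS is assumed to be set by the build system
 * (configure or CMake). GPU_MARK_PUSH / GPU_MARK_POP are
 * hypothetical names used only for this sketch. */
#ifdef HAVE_GPU_MARKERS
#include <nvtx3/nvToolsExt.h>   /* assumed NVTX3 header location */
#define GPU_MARK_PUSH(name) nvtxRangePushA(name)
#define GPU_MARK_POP()      nvtxRangePop()
#else
/* Markers disabled: both macros expand to no-ops. */
#define GPU_MARK_PUSH(name) ((void)(name))
#define GPU_MARK_POP()      ((void)0)
#endif

/* Example region: the range is visible in the profiler only when
 * the code is built with HAVE_GPU_MARKERS defined. */
static double sum_squares(const double *x, int n)
{
    double s = 0.0;
    assert(n >= 0);
    GPU_MARK_PUSH("sum_squares");
    for (int i = 0; i < n; ++i) {
        s += x[i] * x[i];
    }
    GPU_MARK_POP();
    return s;
}
```

Built without the define, the marker calls disappear at compile time,
so a regular build carries no profiling dependency.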
We perform this init step here for the same reason as for rocBLAS: to
avoid spending an abnormally long time in the first step, when the FFT
is called for the first time.
CMake uses this flag in the Release configuration, and Kokkos picks it
up through nvcc_wrapper, which passes it to g++; g++ doesn't recognize
this flag and triggers an error.
"-fast" is usually "-O2" + extra flags, so replacing it with "-O2"
shouldn't affect performance too much.
For now, OpenMP GPU offload is always enabled if the compiler passes
the OpenMP 5 compliance test.
CMake doesn't support OpenMP target flags yet, hence I copy-pasted a
function that handles the job for many compilers, though only NVHPC is
actually ABINIT-proofed for now.