Frontier GPU-aware MPI #405
From some old 2021 docs (https://www.olcf.ornl.gov/wp-content/uploads/2021/04/2021-05-20-Frontier-Tutorial-CCE.pdf):
• Environment variable: CRAY_ACC_USE_UNIFIED_MEM=1
• The CCE offloading runtime library will auto-detect user allocations of pinned or managed memory
• No explicit allocations or transfers will be issued for such memory
• Original pointers are passed directly into GPU kernels
• CRAY_ACC_DEBUG runtime messages reflect this capability
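The behavior described in the slides can be observed from the runtime's debug output. A minimal job-script sketch, assuming the CCE offloading runtime; CRAY_ACC_USE_UNIFIED_MEM comes from the slides above, while the debug level and application name are placeholders:

```shell
# Sketch of a Frontier job-script fragment (CCE offloading runtime).
export CRAY_ACC_USE_UNIFIED_MEM=1
export CRAY_ACC_DEBUG=2    # debug messages show when allocations/transfers are skipped
srun ./app                 # ./app is a placeholder for the actual binary
```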
From here: https://www.openmp.org/wp-content/uploads/2022-04-29-ECP-OMP-Telecon-HPE-Compiler.pdf (CCE OpenMP unified shared memory support for AMD MI250X):

Dynamically enable GPU unified memory for OpenMP map clauses:
• Set env vars CRAY_ACC_USE_UNIFIED_MEM=1 and HSA_XNACK=1
• Skips explicit allocate/transfer for all system memory
• Global "declare target" variables will still be allocated separately (the compiler statically emits a device copy)

Statically enable GPU unified memory for OpenMP map clauses:
• Compile with the "requires unified_shared_memory" directive
• Set env var HSA_XNACK=1
Support and document GPU-aware MPI on Frontier.

If not oversubscribing the GPUs, make sure we run with -c 7 --gpus-per-task=1 --gpu-bind=closest. Because each NIC is connected directly to a GPU, affinity is very important. Since we are already building everything in, it should just be a matter of setting the runtime flag:

MPICH_GPU_SUPPORT_ENABLED=1
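Putting the pieces together, a launch sketch; the flags are from this issue, while the node/rank counts and binary name are assumptions (8 ranks matching the 8 GCDs of one Frontier node):

```shell
# Sketch: GPU-aware MPI launch on one Frontier node.
export MPICH_GPU_SUPPORT_ENABLED=1   # enable GPU-aware MPI in Cray MPICH
# 7 cores per rank, one GCD per rank, bound to the closest GPU so each
# rank talks to the NIC attached to its own GPU.
srun -N 1 -n 8 -c 7 --gpus-per-task=1 --gpu-bind=closest ./app
```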