Releases: ROCm/aomp
rocm-7.0.2
ROCm release v7.0.2
AOMP Release 22.0-1
These are the release notes for AOMP 22.0-1. AOMP uses AMD developer modifications to the upstream LLVM development trunk. These differences are managed in a branch called "amd-staging". This branch is found in a mirror of upstream LLVM found at https://github.com/ROCm/llvm-project. The amd-staging branch is constantly changing as it merges the upstream development trunk with its downstream development updates. The AMD modifications are experimental while under review for the upstream trunk. AOMP uses a snapshot of amd-staging at the commit ids and dates listed below. AOMP also includes builds of related ROCm components. We call AOMP a "standalone" build as it does not use or require ROCm with the exception of the kernel module (amdgpu-dkms) and libdrm which are often part of the Linux distribution. AOMP is isolated from any ROCm installations by installing into /usr/lib/aomp and the use of RPATH for runtime libraries.
For AOMP 22.0-1, the last LLVM trunk commit is 99c741ea473ad47aef50e4be1e6be094741151b2 on September 30, 2025. The last amd-only commit is d96c32cdaf588903917a9e7db172729759e34c9d on September 30, 2025. These commits form a frozen branch now called "aomp-22.0-1". See https://github.com/ROCm/llvm-project/tree/aomp-22.0-1.
The integrated ROCm components for this AOMP release were built with ROCm 7.0.1 sources.
This is the 1st AOMP release based on upstream LLVM 22 development.
These are the changes since 21.0-1:
- Switch to ROCm 7.0.1 sources
- Add BUILD_ICD to build_hipamd.sh to support OpenCL
- Use of aomp-shellcheck on build scripts
- Merge aomp-shell-format
- Rename aomp-shell-format to aomp-shellcheck
- Run aomp-shellcheck on aomp_common_vars and dependent build scripts
- Unit testing for shell scripts
- Change RCCL build from install.sh to traditional build
- Cleanup to remove AOMP_APPLY_ATD_AMD_STAGING_PATCH
- Deprecated rocprof and rocprofv2 in favor of rocprofv3 from rocprofiler-sdk
flang updates:
- Support for complex math intrinsics in target offload regions
- Reduction support for do concurrent
- Mapping improvements
- Add canonical loop operations
- Allow cycle in target teams distribute [simd]
- Additional support for debug in target regions
- Improved alias analysis
- Support directive spellings introduced in OpenMP 6.0
- Added atomic control options for OpenMP offload:
  - -f[no-]atomic-remote-memory
  - -f[no-]atomic-fine-grained-memory
  - -f[no-]atomic-ignore-denormal-mode
  - -f[no-]ignore-denormal-mode
  - -m[no-]unsafe-fp-atomics (alias for -f[no-]ignore-denormal-mode)
- Added support for complex pow
- Add 6.1 as a valid OpenMP version
- Implement LOWER= argument for C_F_POINTER (Fortran 2023)
- Implement !$omp unroll using omp.unroll_heuristic
- For `do concurrent`, enable delayed localization by default
- Fixed issue with named constants in SHARED and FIRSTPRIVATE clauses
- Don't privatize implicit symbols declared by nested `BLOCK`s
- Reassociate ATOMIC update expressions (currently integer type only)
- Parse OpenMP 6.0 map modifiers
- Add -fintrinsic-modules-path= alias
- Support -gsplit-dwarf
- Add support for -ffast-real-mod
- Implicitly map nested allocatable components in derived types
- `do concurrent`: support `reduce` on device
- `do concurrent`: support `local` on device
- Preserve to/from flags in mapper base entry for mappers
- Add new ConvertComplexPow pass for Flang
- Support -flto-partitions=N and -f[no]fat-lto-objects
- Support -gdwarf-N option
- Extend `do concurrent` mapping to device
- Enable tiling, including trip counts that are not a multiple of the tile size
- Fix mapping of character type with LEN > 1 specified
- Fix default firstprivatization miscategorization of mod file symbols
- Reassociate logical ATOMIC update expressions
- Reassociate floating-point ATOMIC update expressions (with -ffast-math)
- Add SPMD-No-Loop mode to OpenMP offload runtime
  - requires setting both of the following flags:
    - -fopenmp-assume-teams-oversubscription
    - -fopenmp-assume-threads-oversubscription
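The two oversubscription flags named above presumably let the compiler assume that enough teams and threads are launched for each thread to execute at most one loop iteration, which is what makes a "no-loop" kernel possible. The following is a minimal C sketch of a loop shape that could qualify. The function name `saxpy_offload`, the file name, the `gfx90a` architecture, and the assumption that the C driver accepts the same flags as flang are illustrative and not taken from these release notes.

```c
/* Hypothetical example kernel: y[i] += alpha * x[i].
 *
 * A combined "target teams distribute parallel for" loop with a simple
 * body is the kind of kernel the oversubscription assumptions apply to.
 *
 * One possible compile line (file name and architecture are assumptions):
 *   clang -O2 -fopenmp --offload-arch=gfx90a
 *         -fopenmp-assume-teams-oversubscription
 *         -fopenmp-assume-threads-oversubscription saxpy.c
 */
void saxpy_offload(int n, float alpha, const float *x, float *y) {
  #pragma omp target teams distribute parallel for \
      map(to: x[0:n]) map(tofrom: y[0:n])
  for (int i = 0; i < n; ++i) {
    y[i] += alpha * x[i];
  }
}
```

The function computes the same result when compiled without OpenMP or when the target region falls back to the host, so it can be checked on a machine without a GPU.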
Errata:
- Flang failure seen in GenASiS/SPEC (error: failure in HLFIR intrinsic simplification)
rocm-6.4.4
ROCm release v6.4.4
rocm-7.0.1
ROCm release v7.0.1
rocm-7.0.0
ROCm release v7.0.0
rocm-6.4.3
ROCm release v6.4.3
rocm-6.4.2
ROCm release v6.4.2
AOMP Release 21.0-1
These are the release notes for AOMP 21.0-1. AOMP uses AMD developer modifications to the upstream LLVM development trunk. These differences are managed in a branch called "amd-staging". This branch is found in a mirror of upstream LLVM found at https://github.com/ROCm/llvm-project. The amd-staging branch is constantly changing as it merges the upstream development trunk with its downstream development updates. The AMD modifications are experimental while under review for the upstream trunk. AOMP uses a snapshot of amd-staging at the commit ids and dates listed below. AOMP also includes builds of related ROCm components. We call AOMP a "standalone" build as it does not use or require ROCm with the exception of the kernel module (amdgpu-dkms) and libdrm which are often part of the Linux distribution. AOMP is isolated from any ROCm installations by installing into /usr/lib/aomp and the use of RPATH for runtime libraries.
For AOMP 21.0-1, the last LLVM trunk commit is 3009aa75cae240fc400c65c748a366d584998f9d on May 13, 2025. The last amd-only commit is a6d9cba0c648aaeb0d637963b1545829a380a3ea on May 13, 2025. These commits form a frozen branch now called "aomp-21.0-1". See https://github.com/ROCm/llvm-project/tree/aomp-21.0-1.
The integrated ROCm components for this AOMP release were built with ROCm 6.4.0 sources.
This is the 2nd AOMP release based on upstream LLVM 21 development.
Changes since AOMP 21.0-0:
- Integrated ROCm components are built with ROCm 6.4 sources, whereas in AOMP 21.0-0 they were built with ROCm 6.3
- The binary name for the LLVM Fortran compiler driver is flang-21 whereas in AOMP 21.0-0, it was flang-new. The symbolic link flang now links to the new binary name flang-21
- The AOMP hip-libraries package now includes rocRAND, hipRAND, and half. AOMP testing has been expanded to include HeCBench
- MI300 xnack issues have been resolved.
- Updates to ROCr patch for deprecated gfx ids.
- Fixed the Kokkos v3.7.01 build
- Added smoke tests for `target firstprivate`
- Limited release build of hip-libraries to gfx90a, gfx942, gfx1103, and gfx1150.
- Several flang (LLVM 21) updates and fixes:
- Fix do-concurrent
- Fix reduction of a single element
- Fix parallel regions with live-out values
- Fix combined target parallel
- Add support for pooled memory allocator
- Replace cmake FLANG_INCLUDE_RUNTIME option with LLVM_ENABLE_RUNTIMES=flang-rt
- Device side Fortran runtime available via -lflang_rt.hostdevice (previously -lFortranRuntimeHostDevice)
- Add hipfort support
- I/O from device supported
- Debug supported on host, initial support provided for target routines
- Support `bind` clause on `loop` and `teams loop`
- Support `reduction` on `loop` directives
- Added support for `target firstprivate` for included target tasks
- Generate math ops for non-precise acos, acosh, asin, asinh, atan, atanh, erfc intrinsic calls
- Allow declare target to be used on functions external to the declare target scope
- Moved the gpurun utility from the aomp-extras repository to the utils directory of aomp. Eventually we will eliminate the aomp-extras repository.
- Restored optimization options for build of OpenMP Device RTL.
- Added environment variables for controlling buffer flush, OMPX_FlushOnBufferFull and OMPX_FlushOnShutdown
- Replaced libomptarget.devicertl.a with target specific bitcode libraries.
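Several items above concern `target firstprivate`. The flang work is Fortran-specific, but the clause behaves the same way in C, so a minimal C sketch may help show what the smoke tests exercise. The function name `scale_indices` and its parameters are hypothetical.

```c
/* Fill out[i] = i * factor inside a target region.
 *
 * "factor" is listed as firstprivate, so the region receives its own copy
 * initialized from the host value at region entry. The result array is
 * copied back to the host through the map(from: ...) clause.
 */
void scale_indices(int n, int factor, int *out) {
  #pragma omp target firstprivate(factor) map(from: out[0:n])
  {
    for (int i = 0; i < n; ++i) {
      out[i] = i * factor;
    }
  }
}
```

Because the region only reads `factor`, the function gives the same result on the device, on host fallback, or when compiled without OpenMP.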
AOMP Release 21.0-0
These are the release notes for AOMP 21.0-0. AOMP uses AMD developer modifications to the upstream LLVM development trunk. These differences are managed in a branch called "amd-staging". This branch is found in a mirror of upstream LLVM found at https://github.com/ROCm/llvm-project. The amd-staging branch is constantly changing as it merges the upstream development trunk with its downstream development updates. The AMD modifications are experimental while under review for the upstream trunk. AOMP uses a snapshot of amd-staging at the commit ids and dates listed below. AOMP also includes builds of related ROCm components. We call AOMP a "standalone" build as it does not use or require ROCm with the exception of the kernel module (amdgpu-dkms) and libdrm which are often part of the Linux distribution. AOMP is isolated from any ROCm installations by installing into /usr/lib/aomp and the use of RPATH for runtime libraries.
For AOMP 21.0-0, the last LLVM trunk commit is 9cdab16da99ad9fdb823853fbc634008229e284f on March 31, 2025. The last amd-only commit is e9b040d02cd3f5e5dae032e7d15d934ea6486d18 on April 1, 2025. These commits form a frozen branch now called "aomp-21.0-0". See https://github.com/ROCm/llvm-project/tree/aomp-21.0-0.
The integrated ROCm components for this AOMP release were built with ROCm 6.3.3 sources.
This is the 1st AOMP release based on upstream LLVM 21 development.
Changes since AOMP 20.0-2:
- In this release, the FORTRAN flang-classic compiler is replaced with the new LLVM compiler (flang-new). Flang-new is built using the LLVM 21 trunk plus changes in the amd-staging branch. In addition to improved performance, flang-new supports print and write statements in the target region for user diagnostics. The existence of any print or write statement in the target region will trigger a service thread that could impact performance, even if the print or write statements are not executed.
- The hipfort component built with flang-new has returned to aomp. Hipfort provides FORTRAN module interfaces to the HIP API and to many other hip math libraries. There are new examples in the examples directory to demonstrate hipfort.
- Improved performance on min and max reductions using fmin and fmax functions to define the reduction.
- Replacement of the amd-staging hostexec infrastructure with the upstream offload rpc mechanism.
- A new infrastructure for executing host APIs in target regions called "Emissary APIs". Emissary APIs use the offload rpc mechanism to transparently execute functions called from a target region on the host. Emissary APIs exist for print, FORTRAN runtime, MPI, and HDF5. MPI and HDF5 are currently placeholders requiring more development to make them functional. The Emissary API for print includes printf, fprintf, and asan exception reporting. The Emissary API for the FORTRAN runtime supports print, write, stop, and abort FORTRAN statements.
- In this release, all OpenMP toolchains (C, C++, and FORTRAN) use a tool called clang-linker-wrapper as the default. This is a single command generated for host and device linking. Previously a multi-step process was used by the LLVM command driver. This multi-step process is still available with the --opaque-offload-linker command line option. Since clang-linker-wrapper obscures the process of device linking, --opaque-offload-linker can be used to see the transformations from heterogeneous objects to fully linked device and host executables.
- This release uses the sources from ROCm 6.3 components for non-compiler components. All llvm-project compiler components were built using the amd-staging branch with the above-mentioned commit hash.
- In this release, we started a process to clean up the examples for the different programming models supported by the ROCm compiler. The new examples are 100% driven by Makefiles so that users can see the compiler commands and the environment they run in. Since the examples are typically in a read-only installation directory, they can now be executed from an out-of-tree directory to avoid the need to copy them. For example, "make -f /usr/lib/aomp/examples/openmp/reduction/Makefile run" will build and run the example.
- A significant number of changes to the AOMP build infrastructure were done to both add flang-new build and remove flang-classic build.
- Merging non-upstream changes into the amd-staging branch now uses GitHub pull requests. We no longer use Gerrit for this purpose. Merging of GitHub PRs still requires successful passing of psdb tests. Merging from upstream trunk is still possible and preferred.
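The print Emissary API described above covers printf, so a diagnostic print can appear directly inside a target region. The following is a minimal C sketch under that assumption. The function name `count_negatives` and the message text are hypothetical, and the performance caveat stated above for Fortran print and write statements may apply here as well.

```c
#include <stdio.h>

/* Count negative entries of a[0..n-1] in a target region and print a
 * diagnostic line for each one found. On a device, the printf call is
 * expected to be forwarded to the host; on host fallback it is an
 * ordinary printf.
 */
int count_negatives(int n, const double *a) {
  int count = 0;
  #pragma omp target teams distribute parallel for \
      map(to: a[0:n]) reduction(+: count)
  for (int i = 0; i < n; ++i) {
    if (a[i] < 0.0) {
      printf("negative value at index %d\n", i);
      ++count;
    }
  }
  return count;
}
```

The order of the printed lines is not guaranteed when iterations run in parallel; only the returned count is deterministic.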
Errata:
- The hip/lib_device example currently fails to build with a link error.
AOMP Release 20.0-2
These are the release notes for AOMP 20.0-2. AOMP uses AMD developer modifications to the upstream LLVM development trunk. These differences are managed in a branch called "amd-staging". This branch is found in a mirror of upstream LLVM found at https://github.com/ROCm/llvm-project. The amd-staging branch is constantly changing as it merges the upstream development trunk with its downstream development updates. The AMD modifications are experimental while under review for the upstream trunk. AOMP uses a snapshot of amd-staging at the commit ids and dates listed below. AOMP also includes builds of related ROCm components. We call AOMP a "standalone" build as it does not use or require ROCm with the exception of the kernel module (amdgpu-dkms) and libdrm which are often part of the Linux distribution. AOMP is isolated from any ROCm installations by installing into /usr/lib/aomp and the use of RPATH for runtime libraries.
For AOMP 20.0-2, the last LLVM trunk commit is c8c2574832ed2064996389e4259eaf0bea0fa7951 on January 29, 2025. The last amd-only commit is c273851a8de71cf3001ad8fdc5abcc829b591b45 on January 29, 2025. These commits form a frozen branch now called "aomp-20.0-2". See https://github.com/ROCm/llvm-project/tree/aomp-20.0-2.
The integrated ROCm components for this AOMP release were built with ROCm 6.3.2 sources.
This is the 3rd AOMP release based on upstream LLVM 20 development.
Changes since AOMP 20.0-1:
- Added build of math rocmlibs (aomp-hip-libraries). Currently only the following architectures are supported: gfx900;gfx906:xnack-;gfx908:xnack-;gfx90a;gfx942;gfx1010;gfx1012;gfx1030;gfx1100;gfx1101;gfx1102;gfx1151;gfx1200;gfx1201
- Added optional aomp-hip-libraries package. This contains libraries for rocBLAS, rocPRIM, rocSPARSE, rocSOLVER, and hipBLAS.
- Added preproduction flang-new executable. Flang-classic is still default with a flang to flang-classic symbolic link.
- Moved to ROCm 6.3.2 sources for non-compiler related repositories.
Errata:
- flang-classic failures seen in fbabelstream and Nekbone.