---
title: "OpenMP offloading: <br>device functions"
event: CSC Summer School in High-Performance Computing 2024
lang: en
---

OpenMP offloading: device functions {.section}

Function calls in compute regions

  • Calling functions inside loops often improves readability and modularisation
  • By default a function is compiled only for the host, so a loop that calls it cannot simply be offloaded
  • The compiler has to be told to generate a device version of the function as well

Directive: declare target

  • Define a function to be compiled for an accelerator as well as the host
  • C/C++
    • enclose the function declaration and definition between #pragma omp declare target and #pragma omp end declare target
  • Fortran
    • place !$omp declare target in the specification part of the subroutine or function
  • The functions will now be compiled both for host and device execution
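
As a minimal sketch of the "both host and device" point, the C fragment below calls the same function once inside a target region and once in ordinary host code, then compares the results. The names `half_step` and `count_mismatches` are hypothetical and chosen only for this illustration. Without OpenMP support, or without an available device, the pragmas are ignored or fall back to the host, so the code still runs.

```c
#include <assert.h>
#include <stddef.h>

/* Hypothetical helper: compiled for the host and for the device. */
#pragma omp declare target
static float half_step(int i)
{
    /* Multiples of 0.5 are exactly representable in float. */
    return 0.5f * (float)(i + 1);
}
#pragma omp end declare target

/* Fill dev[0..n-1] in a target region, then recompute every value
 * on the host and return the number of elements that differ. */
int count_mismatches(float *dev, int n)
{
    assert(dev != NULL && n > 0);

    /* Device version of half_step (host fallback if no device). */
    #pragma omp target teams loop map(from: dev[0:n])
    for (int i = 0; i < n; ++i) {
        dev[i] = half_step(i);
    }

    /* Host version of the same function. */
    int mismatches = 0;
    for (int i = 0; i < n; ++i) {
        if (dev[i] != half_step(i)) {
            ++mismatches;
        }
    }
    return mismatches;
}
```

Calling `count_mismatches(buf, 8)` on an 8-element buffer is expected to return 0. The flags needed to enable offloading differ between compilers and target devices.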

Example: declare target

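<div class="column">
**C/C++**
```c
#pragma omp declare target
void foo(float* v, int i, int n) {
    for (int j=0; j<n; ++j) {
        // +1 offsets mirror the 1-based Fortran version and avoid division by zero
        v[i*n+j] = 1.0f/((i+1)*(j+1));
    }
}
#pragma omp end declare target

#pragma omp target teams loop
for (int i=0; i<n; ++i) {
    foo(v,i,n);  // executed on the device
}
```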

</div>

<div class="column">
**Fortran**
```fortranfree
subroutine foo(v, i, n)
  !$omp declare target
  real :: v(:,:)
  integer :: i, j, n

  do j=1,n
     v(i,j) = 1.0/(i*j)
  enddo
end subroutine

!$omp target teams loop
do i=1,n
  call foo(v,i,n)
enddo
!$omp end target teams loop
```
</div>

Summary

  • Declare target directive
    • Tells the compiler to build a device version of a function, so that it can be called within offloaded parallel loops
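
To illustrate the summary, here is a hedged C sketch in which a device function is called from an offloaded loop with a reduction. It approximates π with the midpoint rule. The names `integrand` and `midpoint_pi` are hypothetical and not part of the course material. If the compiler ignores the OpenMP pragmas, the loop simply runs sequentially on the host.

```c
#include <assert.h>

/* Hypothetical device function: integrand of 4/(1+x^2) on [0,1]. */
#pragma omp declare target
static double integrand(double x)
{
    return 4.0 / (1.0 + x * x);
}
#pragma omp end declare target

/* Midpoint-rule approximation of pi using n subintervals. */
double midpoint_pi(int n)
{
    assert(n > 0);

    const double h = 1.0 / n;
    double sum = 0.0;

    /* Each iteration calls the device version of integrand;
     * partial sums are combined by the reduction clause. */
    #pragma omp target teams loop reduction(+:sum) map(tofrom: sum)
    for (int i = 0; i < n; ++i) {
        sum += integrand((i + 0.5) * h);
    }
    return sum * h;
}
```

Because the helper sits inside the declare target region, the loop body stays short and the same function remains callable from host code.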