Link-time optimization and inlining #328

Closed
sbryngelson opened this issue Feb 2, 2024 · 0 comments · Fixed by #581
Labels: enhancement, good first issue, question

Comments


sbryngelson commented Feb 2, 2024

A possible fix for the performance degradation seen with NVHPC when calling subroutines across modules, quoted from a reply on the NVIDIA developer forum: https://forums.developer.nvidia.com/t/nvhpc-23-11-fortran-does-it-inline-public-subroutines-across-modules/281047/2

Though for cross-file inlining you can try the two-pass method. First create an inline extract library by compiling all files with “-Mextract=lib:libname”, replacing “libname” with what you’d like to call it. Then compile with “-Minline=lib:libname” to use the extract library. Inlining is performed prior to the device code generation.
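
The two-pass method quoted above could be scripted roughly as follows. This is only a sketch under stated assumptions: the library name `inline_lib` and the source files `m_helpers.f90` and `m_solver.f90` are hypothetical placeholders, not names from this project, and the only compiler flags used are the two from the forum reply plus `-c`. The script defaults to a dry run that prints the commands, so it can be inspected without NVHPC installed.

```shell
#!/bin/sh
# Sketch of the two-pass inlining build described in the forum reply.
# FC, INLINE_LIB, and SRCS are hypothetical placeholders; adjust for a real build.
FC="${FC:-nvfortran}"
INLINE_LIB="${INLINE_LIB:-inline_lib}"
SRCS="${SRCS:-m_helpers.f90 m_solver.f90}"   # list modules in dependency order
DRY_RUN="${DRY_RUN:-1}"                      # 1 = only print commands, 0 = run them

# Print a command, and execute it only when DRY_RUN=0.
run() {
    echo "$*"
    if [ "$DRY_RUN" = "0" ]; then "$@"; fi
}

two_pass_inline() {
    # Pass 1: build the inline extract library from every source file.
    for src in $SRCS; do
        run "$FC" "-Mextract=lib:$INLINE_LIB" "$src" || return 1
    done
    # Pass 2: compile each file, letting the compiler inline from the library.
    for src in $SRCS; do
        run "$FC" "-Minline=lib:$INLINE_LIB" -c "$src" || return 1
    done
}

two_pass_inline
```

Running it with `DRY_RUN=0` would invoke the compiler. In a real build, the existing optimization and offload flags would need to be added to the second pass, and the extract pass would likely become a separate build-system target that runs before any object file is compiled.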
