Link-time optimization and inlining #328

Closed

sbryngelson opened this issue Feb 2, 2024 · 0 comments · Fixed by #581

Member

sbryngelson commented Feb 2, 2024 (edited)

A possible fix for the performance degradation seen on NVHPC when calling subroutines across modules: https://forums.developer.nvidia.com/t/nvhpc-23-11-fortran-does-it-inline-public-subroutines-across-modules/281047/2

Though for cross-file inlining you can try the two-pass method. First create an inline extract library by compiling all files with “-Mextract=lib:libname”, replacing “libname” with what you’d like to call it. Then compile with “-Minline=lib:libname” to use the extract library. Inlining is performed prior to the device code generation.
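
A rough sketch of how the two-pass method quoted above could be scripted. It only builds the command lines and does not run a compiler. The `-Mextract=lib:` and `-Minline=lib:` flags come from the quoted forum reply; the function name `two_pass_commands`, the library name `inline_lib`, the source file names, and the use of `-c` in both passes are assumptions for illustration, not what this project's build system does.

```python
from typing import List


def two_pass_commands(
    sources: List[str],
    lib: str = "inline_lib",      # hypothetical extract-library name
    compiler: str = "nvfortran",  # NVHPC Fortran driver
) -> List[List[str]]:
    """Return the command lines for a two-pass cross-file inlining build.

    Pass 1 compiles every source with -Mextract=lib:<lib> to populate the
    inline extract library. Pass 2 compiles every source again with
    -Minline=lib:<lib> so the compiler can inline from that library.
    The -c flag (compile only) is an assumption, not from the forum reply.
    """
    extract_pass = [[compiler, f"-Mextract=lib:{lib}", "-c", src] for src in sources]
    inline_pass = [[compiler, f"-Minline=lib:{lib}", "-c", src] for src in sources]
    # All extract commands come first: the library must be complete
    # before any file is compiled with inlining enabled.
    return extract_pass + inline_pass


if __name__ == "__main__":
    # Hypothetical module files; print the commands instead of running them.
    for cmd in two_pass_commands(["m_a.f90", "m_b.f90"]):
        print(" ".join(cmd))
```

To actually run the build, each command list could be passed to `subprocess.run(cmd, check=True)` in order, which requires an NVHPC installation on the machine.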

sbryngelson added enhancement good first issue question labels

sbryngelson assigned AiredaleDev

AiredaleDev mentioned this issue

Two-stage compilation to allow cross-module inlining on NVHPC #569

Closed

10 tasks

henryleberre mentioned this issue

Two-stage IPO for NVHPC #581

Merged

henryleberre closed this as completed in #581
