Build failure: resulting binary over 2GB #39
Comments
Could you give some context here to reproduce the issue? The best would be a MAGMA Makefile (make.inc) or CMake configuration that reproduces it. I saw the Nixpkgs issue, but it isn't clear there how you are building MAGMA. It appears that you are compiling for all these CUDA capabilities:
Right
Maybe @ConnorBaker can explain why we do this?
The GPU selection done in Nixpkgs is fairly naive and driven by https://github.com/NixOS/nixpkgs/blob/daa2a442b9c82a265afaedbdf5589adadc01095c/pkgs/development/cuda-modules/gpus.nix. Essentially, every capability supported by a CUDA version is added by default to maximize compatibility and performance. (This is also done in part to reduce load on CI, as specifying a different list of capabilities involves rebuilding all CUDA-enabled packages.)

I recommended that users configure Nixpkgs to target their specific capability. Unfortunately, that generally requires that they rebuild everything locally, as CI only caches packages built with the default configuration.

I've not done any research into the various capabilities to find out what has changed between them (excluding the floating-point speed-up between 8.0 and 8.6), so that would be a good place to start if we wanted to cull additional capabilities from the default set.
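As a rough illustration of that recommendation, a per-user Nixpkgs configuration pinned to a single capability might look like the sketch below; the capability value `"8.6"` is only an example and should be replaced with the target GPU's capability:

```nix
# Hypothetical sketch: pin the CUDA capabilities Nixpkgs builds for,
# instead of the full default list generated from gpus.nix.
# "8.6" is only an example value, not a recommendation.
import <nixpkgs> {
  config = {
    allowUnfree = true;          # CUDA packages are unfree
    cudaSupport = true;
    cudaCapabilities = [ "8.6" ];
  };
}
```

As noted above, building with a non-default capability list means the CUDA-enabled packages are no longer served from the CI binary cache and have to be rebuilt locally.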
This is related to pytorch/pytorch#39968, which was resolved in pytorch/pytorch#49050 by splitting the library into smaller ones.
I am also including the Nixpkgs issue for more details: NixOS/nixpkgs#239237
In Nixpkgs we used `-Xfatbin=-compress-all` first (same as pytorch), but we are again hitting the 2G limit. Using `-mcmodel=large` does not seem to help, so I guess the only way is to split the magma library into smaller ones too (same approach as pytorch).

On `x86_64-linux`:

On `aarch64-linux`:
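To make the compression-flag attempt described above concrete, here is a hypothetical overlay-style sketch of how a flag like `-Xfatbin=-compress-all` could be passed through to MAGMA's CUDA compilation from Nixpkgs. `CMAKE_CUDA_FLAGS` is standard CMake, but the exact mechanism used in the actual Nixpkgs derivation may differ:

```nix
# Hypothetical sketch, not the actual Nixpkgs patch: append an nvcc flag that
# compresses device code in the fat binary, to try to stay under the 2 GB limit.
final: prev: {
  magma = prev.magma.overrideAttrs (old: {
    cmakeFlags = (old.cmakeFlags or [ ]) ++ [
      "-DCMAKE_CUDA_FLAGS=-Xfatbin=-compress-all"
    ];
  });
}
```

As the comment notes, compression only delays the problem once many capabilities are embedded; splitting the library, as PyTorch did, is the more robust fix.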