Dynamic CUDA driver loader #1841
base: master
Hi @didzis, this is a great idea. I'm giving it a try now on Windows, but without success. I built the Windows DLL with
Hi, this was implemented only for non-Windows systems, but I made an attempt to support the Windows platform in the latest commit. I have no means of testing it myself. You may need to change the driver DLL name. Note that there is a comment stating that no static cuBLAS library has been available since CUDA Toolkit 12.3.1, and thus static linking for cuBLAS is disabled. If this approach works for you, a version check for older CUDA Toolkits may solve this. It should work as is with the dynamic cuBLAS library, but dynamic linking against any CUDA library defeats the purpose of all this.
@ggerganov, here it is possible to embed the contents of
My long-term goal for addressing this is to move the backends to dynamic libraries loadable at run time, so that a single build could serve all the backends. I don't think this PR is going to work on Windows for the reasons already mentioned: some CUDA libraries do not have static versions on Windows, so the executable will depend on the CUDA DLLs regardless.
OK, to me it seems better to aim for the more general solution and not merge this change for now.
I didn't want to step into the Windows realm with this PR, as it was intended as a Linux-only feature, so I reverted this PR to a Linux-only solution. I also checked multiple CUDA Toolkit releases for Windows, and unfortunately it is as mentioned before: the cuBLAS static libraries are missing from the Windows releases. The general solution mentioned above is great; however, it has some disadvantages:
With this PR the CUDA code is made optional by dynamically loading only the
A quote from the NVIDIA documentation here:
The static cuBLAS library itself does the same - loads
Although there is no native cuBLAS static library available for Windows, CUDA can be used with Windows Subsystem for Linux 2, which is a Linux system, and this PR still applies out of the box:
I understand that there are no other options left for native Windows applications, but I fail to see any reason not to support both approaches on the Linux platform (or WSL 2 on Windows). @ggerganov I believe it's still worth considering merging this optional (and small) feature in one form or another (i.e., the solution can also be merged into
…ntime This approach lets CUDA-enabled binaries run on systems without CUDA-supported GPUs and fall back to alternative computation methods.
This PR implements an optional dynamic CUDA driver loader and static linking against the CUDA runtime.
As a result, CUDA-enabled binaries can run without recompilation on systems with or without CUDA-supported GPUs (and the CUDA driver), falling back to alternative computation methods when CUDA is unavailable.
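To illustrate the general idea of a dynamic driver loader, here is a minimal sketch for Linux. It is not the code from this PR: the function name `cuda_driver_available` is hypothetical, and the sketch assumes the driver library is installed as `libcuda.so.1` and exports `cuInit` from the CUDA driver API. It only probes whether the driver can be loaded and initialized, so the caller can decide between the CUDA path and a fallback.

```c
#include <assert.h>
#include <dlfcn.h>   /* dlopen, dlsym, dlclose (link with -ldl on older glibc) */
#include <stddef.h>

/* Shape of cuInit in the CUDA driver API. CUresult is an enum; 0 means success.
 * Declared locally so the binary has no link-time dependency on the driver. */
typedef int (*cu_init_fn)(unsigned int flags);

/* Hypothetical helper: returns 1 if the named driver library can be loaded
 * and cuInit(0) succeeds, otherwise 0. */
int cuda_driver_available(const char *libname) {
    assert(libname != NULL);

    /* Resolve the driver at run time instead of at link time. */
    void *handle = dlopen(libname, RTLD_NOW | RTLD_LOCAL);
    if (handle == NULL) {
        return 0; /* no driver installed: the caller should fall back */
    }

    /* POSIX permits converting the dlsym result to a function pointer. */
    cu_init_fn cu_init = (cu_init_fn) dlsym(handle, "cuInit");
    int ok = (cu_init != NULL) && (cu_init(0) == 0);

    if (!ok) {
        dlclose(handle); /* driver present but unusable, e.g. no supported GPU */
    }
    /* On success the handle is intentionally kept open for later driver calls. */
    return ok;
}
```

A caller would check `cuda_driver_available("libcuda.so.1")` once at startup and select the CUDA backend only when it returns 1, using the alternative computation path otherwise.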