Fix for Nvidia installed deps detection algorithm in gpu.go #4106
Related to #3593, #4008
In brief: I ran into a similar issue and, after a long analysis, traced the problem to the current version of the gpu.go code.
Context:
Currently, the code looks for the cudart64_*.dll library in the following folders:
var CudartWindowsGlobs = []string{ "c:\\Program Files\\NVIDIA GPU Computing Toolkit\\CUDA\\v*\\bin\\cudart64_*.dll" }
The problem is that among the paths in (3) there is a path to the PhysX installation folder (installed by default with Nvidia drivers).
The PhysX installation contains cudart64_*.dll, but it lacks two related libraries that ollama needs: cublas and cublasLt.
This causes the scanner to detect this folder every time a chat is started:
(when there is no other cudart anywhere on the system)
(and when everything is in place, alongside the legitimate cudart paths)
The problem doesn't manifest itself so obviously on most devices now because:
Everything would be fine for now if it were not for PhysX Legacy, which is installed along with some older games and applications.
In this case, the portable ollama prioritizes the cudart from the PhysX Legacy folder even over the cudart in the folder the portable ollama .exe is running from. This leads to an unhandled error and a crash of the ollama server (the exact case of #3593): once it finds the cudart in the PhysX folder, it cannot discover the cublas and cublasLt libraries that the LLM needs to work.
To summarize: the current implementation of gpu.go may cause bigger problems in the future. Even now, it causes malfunctions, provided 3 conditions are met:
In any case, treating the PhysX folder as a legitimate source of the necessary Nvidia libraries is unintended behavior that leads to crashes when a chat session starts. Two required libraries are missing from every version of PhysX.
Also, the proposed code checks the scanned folders for the presence of these very libraries, which avoids similar cases even when they are unrelated to PhysX.