MPS GPU not being used when running run_localGPT.py #430
-
When you run this, is the …
-
@PromtEngineer, do you have any updates on what I can do to make this utilize MPS? Sorry to bug you; I understand if you have other matters to attend to.
-
Hi, I'm a beginner too, and your problem is so strange that it confuses me as well, but I can give you a suggestion to try (see the sketch below). Sorry I can't be of more help.
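As a first check, you could confirm that the PyTorch build on your machine was compiled with MPS support and can actually allocate tensors on the Apple GPU. This is a minimal sketch using only PyTorch's standard MPS API; nothing here is localGPT-specific:

```python
# Minimal sketch: verify that PyTorch can see the Apple GPU at all.
import torch

print("MPS built:    ", torch.backends.mps.is_built())     # torch compiled with MPS?
print("MPS available:", torch.backends.mps.is_available())  # macOS + hardware OK?

if torch.backends.mps.is_available():
    x = torch.ones(3, device="mps")       # allocate a tensor on the Mac GPU
    print("Tensor lives on:", x.device)   # expected: mps:0
```

If either check prints False, no device flag in localGPT will help until the PyTorch install itself is fixed.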
-
@Satyam7166-tech The embedding model also runs on the GPU, and since an NVIDIA GPU is not present on your machine, that might be slowing down usage of the Mac GPU. Let's keep this issue open in case someone finds a solution.
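For anyone who wants to experiment, here is a minimal sketch of pointing the embedding step at the Mac GPU. It assumes the langchain HuggingFaceInstructEmbeddings wrapper and the hkunlp/instructor-large model that localGPT uses by default; adjust the names if your checkout differs:

```python
# Minimal sketch: load the instructor embedding model on Apple's MPS backend,
# falling back to CPU when MPS is unavailable. Model name is the localGPT default.
import torch
from langchain.embeddings import HuggingFaceInstructEmbeddings

device_type = "mps" if torch.backends.mps.is_available() else "cpu"

embeddings = HuggingFaceInstructEmbeddings(
    model_name="hkunlp/instructor-large",
    model_kwargs={"device": device_type},
)

print(f"Embedding model loaded on: {device_type}")
```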
-
@PromtEngineer
System: M1 Pro
Model: TheBloke/Llama-2-7B-Chat-GGML
Here is my GPU usage when I run ingest.py (with MPS enabled).
And now look at the GPU usage when I run run_localGPT.py (with MPS enabled).
The spike is very thin (ignore the earlier thick spike, which corresponds to ingest) and appears only about 2 seconds before the LLM generates the answer.
And here is the message indicating that MPS is being used.
I am a complete beginner, and any input will be greatly appreciated.
Thanks
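One detail that may matter here (an observation, not a confirmed fix): TheBloke/Llama-2-7B-Chat-GGML is a GGML model, so it is served through llama-cpp-python rather than through PyTorch, and the PyTorch MPS setting does not apply to that path. Metal offload for GGML only happens if llama-cpp-python was installed with Metal support and layers are actually offloaded, roughly as sketched below; the path and values are illustrative assumptions:

```python
# Minimal sketch: load a GGML model with Metal offload via llama-cpp-python.
# Prerequisite: reinstall the wheel with Metal enabled, e.g.
#   CMAKE_ARGS="-DLLAMA_METAL=on" pip install --force-reinstall llama-cpp-python
from langchain.llms import LlamaCpp

llm = LlamaCpp(
    model_path="models/llama-2-7b-chat.ggmlv3.q4_0.bin",  # hypothetical local path
    n_ctx=2048,
    n_batch=512,
    n_gpu_layers=1,  # > 0 enables Metal offload on Apple Silicon
    verbose=True,    # the startup log should mention Metal if the build supports it
)

print(llm("Q: What is 2 + 2? A:"))
```

If the GPU spike stays thin after this, checking the llama-cpp-python startup log for a Metal initialization message is a quick way to tell whether the offload is happening at all.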