MPS GPU not being used when running run_localGPT.py #430

Satyam7166-tech · 2023-08-28T11:22:55Z

Satyam7166-tech
Aug 28, 2023

System: M1 pro
Model: TheBloke/Llama-2-7B-Chat-GGML

Here is my GPU usaage when I run ingest.py (with mps enabled)

And now look at the GPU usage when I run run_localGPT.py (with mps enabled)

The spike is very thick (ignore the previous thick spike. It is denoting ingest) and happens just about 2 seconds before the LLM generates the answer.
And here is the message specifying that mps is being used.

I am a complete beginner and any input will be greatly appreciated.
Thanks

PromtEngineer · 2023-08-29T07:23:17Z

PromtEngineer
Aug 29, 2023
Maintainer

When you run this, is the BLAS=1?

1 reply

Satyam7166-tech Aug 30, 2023
Author

Yes, it is.

Satyam7166-tech · 2023-09-03T10:28:08Z

Satyam7166-tech
Sep 3, 2023
Author

@PromtEngineer , Do you have any updates on what I can do to make this utilize mps?

Sorry to bug you. I understand if you have some other matters to attend.

0 replies

zouy50 · 2023-09-04T02:30:27Z

zouy50
Sep 4, 2023

Hi, I'm a beginner either. Your problem really so strange that I am confused too. But I can give you some suggestions for trying:

you can run this with terminal not in IDE
you can use the other model try again
you can run model without this project, such as only run model with llamacpp
you can change the documents, such as a simple one
check your system settings maybe some config make this problem, for example, the power manage, the other application settings affecting

Sorry can't for helping you.

1 reply

Satyam7166-tech Sep 4, 2023
Author

Thank you so much for your efforts.

My friend was facing the same error on his mac so I think the problem is with our setup.
Only using llama.cpp was working for us (mps usage) but connecting it to vector database is where the issues began.

Though for the past couple of days, I've been playing around with Ollama and now thankfully, it's working for my use case.
Again, thanks for your efforts.

Do you think I should close the issue now? I am not sure of the protocol.

PromtEngineer · 2023-09-05T00:51:55Z

PromtEngineer
Sep 5, 2023
Maintainer

@Satyam7166-tech The embedding model is also using GPU and since an NVIDIA GPU is not present on your machine, that might slow down the usage of Mac GPU. Let's keep this issue open in case if someone finds a solution.

0 replies

Provide feedback

Saved searches

Use saved searches to filter your results more quickly

MPS GPU not being used when running run_localGPT.py #430

{{title}}

Replies: 4 comments 2 replies

{{title}}

{{title}}

{{title}}

{{title}}

{{title}}

{{title}}

Select a reply

MPS GPU not being used when running run_localGPT.py #430

Satyam7166-tech Aug 28, 2023

Replies: 4 comments · 2 replies

PromtEngineer Aug 29, 2023 Maintainer

Satyam7166-tech Aug 30, 2023 Author

Satyam7166-tech Sep 3, 2023 Author

zouy50 Sep 4, 2023

Satyam7166-tech Sep 4, 2023 Author

PromtEngineer Sep 5, 2023 Maintainer

Satyam7166-tech
Aug 28, 2023

Replies: 4 comments 2 replies

PromtEngineer
Aug 29, 2023
Maintainer

Satyam7166-tech Aug 30, 2023
Author

Satyam7166-tech
Sep 3, 2023
Author

zouy50
Sep 4, 2023

Satyam7166-tech Sep 4, 2023
Author

PromtEngineer
Sep 5, 2023
Maintainer