
Is there a plan to support NPU? #30

Open
miracle777 opened this issue May 31, 2024 · 28 comments

Comments

@miracle777

Snapdragon X Elite and Core Ultra have NPUs built-in. However, it seems that the capabilities of the NPU can only be utilized with software that supports it. Does LM Studio have any plans to support these NPUs?

With the new Copilot+PC standard computers, the GPU is integrated into the CPU, and the VRAM shares memory with the main RAM. As a result, the graphics capability is a concern compared to laptops with NVIDIA GPUs. Therefore, even though these computers are touted as AI-ready, some use NPUs instead of GPUs. It would be very helpful if LM Studio could also support NPUs.

@vip7009pro

same question

@KXLF

KXLF commented Jul 16, 2024

Hoping for support for the Snapdragon X Elite's NPU.

@mattwltr

mattwltr commented Aug 1, 2024

Bump. What's the plan to support Snapdragon's X Elite NPU in the Technical Preview?

@robvoi

robvoi commented Sep 17, 2024

Bumping :-)
An outlook would be great.

@louis-thevenet

Up

@trigger2k20

Hello, any update on this?

@louis-thevenet

From what I read, we basically have to wait until NPU support is added to llama.cpp, since that's what LM Studio uses for inference.
Here is a related issue: ggml-org/llama.cpp#9181

@dav3shanahan

Any update on the NPU?

@dav3shanahan

When will this be released? https://www.tiktok.com/@shanselman/video/7439403306352528671

@francoglov

bump

@e-hamza

e-hamza commented Dec 16, 2024

up

@exsilium

daily bump.

@GTMoraes

I would also love it

@jpeponis

Now that 3.6 is out, a beta update with NPU support would be really cool. I'd like my Ultra 155H NPU not to go to waste.

@vikrantSinghOnGithub

Seems like very soon we are going to get a build that can run LM Studio on the NPU. https://www.linkedin.com/posts/shanselman_wow-check-this-out-its-lmstudioai-running-activity-7265044845801418752-Cc1p?utm_source=share&utm_medium=member_android

However, I have tried running an LLM on the NPU with AnythingLLM and there is no big jump compared to the CPU.

@louis-thevenet

> Seems like very soon we are going to get a build that can run LM Studio on the NPU. https://www.linkedin.com/posts/shanselman_wow-check-this-out-its-lmstudioai-running-activity-7265044845801418752-Cc1p?utm_source=share&utm_medium=member_android
>
> However, I have tried running an LLM on the NPU with AnythingLLM and there is no big jump compared to the CPU.

Running on the NPU is probably more efficient than on the CPU, though.

@DMontgomery40

DMontgomery40 commented Feb 5, 2025

Any update on this that I may have missed?

For running inference on yolov9c_relu_int8_320 (object detection) in OpenVINO, it is insanely efficient:

(two screenshots attached)
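For Intel hardware, OpenVINO exposes the NPU as a device plugin alongside CPU and GPU. A minimal, hypothetical sketch of how an application might prefer the NPU when the driver exposes it (the device-selection helper and the model filename are illustrative assumptions, not anything LM Studio ships):

```python
def pick_device(available_devices):
    """Prefer the NPU plugin when the driver exposes it, else fall back to CPU."""
    return "NPU" if "NPU" in available_devices else "CPU"

try:
    import openvino as ov  # requires the openvino package and an installed NPU driver

    core = ov.Core()
    device = pick_device(core.available_devices)
    # Hypothetical model file; compile_model targets the chosen device plugin.
    # compiled = core.compile_model(core.read_model("yolov9c_relu_int8_320.xml"), device)
except ImportError:
    # openvino not installed here; simulate a device list for illustration.
    device = pick_device(["CPU", "GPU"])

print(device)
```

The same pattern (query available devices, then compile for the most capable one) is how OpenVINO-based apps typically make NPU use opportunistic rather than required.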

@awaLiny2333

+1 Need support for the Intel NPU ヘ( ̄ω ̄ヘ)

@GTMoraes

GTMoraes commented Feb 6, 2025

> Any update on this that I may have missed?
>
> For running inference on yolov9c_relu_int8_320 (object detection) in OpenVINO, it is insanely efficient:
>
> (two screenshots attached)

AFAIK only AnythingLLM is able to use the NPU for running a model.
I recall they mentioned that the models need to be specifically built to run on the NPU, and that they only support Qualcomm's NPU for now.

There are just two Llama 8B models to choose from when running through the NPU.

@ofoacimr

Microsoft used the ONNX QDQ format to get LLMs running on the NPU, via the AI Toolkit VS Code extension.

https://blogs.windows.com/windowsdeveloper/2025/01/29/running-distilled-deepseek-r1-models-locally-on-copilot-pcs-powered-by-windows-copilot-runtime/

> While the Qwen 1.5B release from DeepSeek does have an int4 variant, it does not directly map to the NPU due to presence of dynamic input shapes and behavior – all of which needed optimizations to make compatible and extract the best efficiency. Additionally, we use the ONNX QDQ format to enable scaling across a variety of NPUs we have in the Windows ecosystem. We work out an optimal operator layout between the CPU and NPU for maximum power-efficiency and speed.
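On the application side, ONNX Runtime exposes Qualcomm's NPU through its QNN execution provider, so consuming a QDQ model looks roughly like the sketch below. This is a hedged illustration, not Microsoft's actual code: the model filename is hypothetical, and the QNN provider is only available in QNN-enabled onnxruntime builds on Windows/ARM64.

```python
def choose_provider(available):
    """Prefer the Qualcomm NPU (QNN) provider when present, else fall back to CPU."""
    for ep in ("QNNExecutionProvider", "CPUExecutionProvider"):
        if ep in available:
            return ep
    raise RuntimeError("no usable execution provider")

try:
    import onnxruntime as ort  # QNN support requires a QNN-enabled onnxruntime build

    provider = choose_provider(ort.get_available_providers())
    # Hypothetical QDQ-quantized model; supported ops dispatch to the NPU,
    # the rest fall back to the CPU, matching the operator split described above.
    # sess = ort.InferenceSession("deepseek_r1_distill_qdq.onnx", providers=[provider])
except ImportError:
    # onnxruntime not installed here; simulate a provider list for illustration.
    provider = choose_provider(["CPUExecutionProvider"])

print(provider)
```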

@dav3shanahan

dav3shanahan commented Feb 14, 2025 via email

@DMontgomery40

> If you download AI Toolkit in Visual Studio Code you can run DeepSeek R1 using the NPU locally

thanks, just on Windows though? anything for Linux?

@dav3shanahan

Maybe there is more info here: https://code.visualstudio.com/docs/setup/linux

@DMontgomery40

> Maybe there is more info here: https://code.visualstudio.com/docs/setup/linux

thanks, that supports the Qualcomm NPU, not Intel, but appreciate it either way. Unfortunately Intel locked its NPU into a tight little OpenVINO Toolkit world. It's actually fantastic for image recognition and object detection. And that is all, apparently lol

@oising

oising commented Feb 17, 2025

Another MS SL7/arm64 + NPU owner here. Bump!

@AleksJev

+1 Intel Core Ultra 9 185H. Intel AI Boost.

@nebulakid

+1 please 🙏

@grabar-prog

+1
