Is there a plan to support NPU? #30
Comments
same question |
Hoping to see support for the Snapdragon X Elite's NPU |
Bump. What's the plan to support Snapdragon's X Elite NPU in the Technical Preview? |
Pumping :-) |
Up |
Hello, any update on this ? |
From what I read, we basically have to wait until NPU support is added to llama.cpp, since that's what LM Studio uses for inference. |
Any update on the NPU? |
when will this be released? https://www.tiktok.com/@shanselman/video/7439403306352528671 |
bump |
up |
daily bump. |
I also would love it |
Now that 3.6 is out, it would be really cool for a beta update for NPU. I'd like my Ultra 155H NPU not to go to waste. |
Seems like we are going to get a build that can run LM Studio on the NPU very soon. https://www.linkedin.com/posts/shanselman_wow-check-this-out-its-lmstudioai-running-activity-7265044845801418752-Cc1p?utm_source=share&utm_medium=member_android However, I have tried running an LLM on the NPU with AnythingLLM, and it doesn't show a big jump compared to the CPU. |
Running on the NPU is probably more power-efficient than on the CPU, though |
+1 Need support to Intel NPU ヘ( ̄ω ̄ヘ) |
AFAIK only AnythingLLM is able to use the NPU for running a model, and there are just two Llama 8B models to choose from when running through the NPU. |
Microsoft used the ONNX QDQ format to get LLMs running on the NPU, with the AI Toolkit VS Code extension.
|
If you download AI toolkit in Visual Studio Code you can run DeepSeek r1
using the NPU locally
…On Fri, Feb 14, 2025 at 1:46 PM ofoacimr ***@***.***> wrote:
Microsoft used ONNX QDQ format to make LLMs running on NPU, with AI
Toolkit VS Code extension.
https://blogs.windows.com/windowsdeveloper/2025/01/29/running-distilled-deepseek-r1-models-locally-on-copilot-pcs-powered-by-windows-copilot-runtime/
While the Qwen 1.5B release from DeepSeek does have an int4 variant, it
does not directly map to the NPU due to presence of dynamic input shapes
and behavior – all of which needed optimizations to make compatible and
extract the best efficiency. Additionally, we use the ONNX QDQ
<https://onnxruntime.ai/docs/performance/model-optimizations/quantization.html>
format to enable scaling across a variety of NPUs we have in the Windows
ecosystem. We work out an optimal operator layout between the CPU and NPU
for maximum power-efficiency and speed.
|
thanks, just on windows though? anything for linux? |
Maybe there is more info here: https://code.visualstudio.com/docs/setup/linux |
Thanks, that supports the Qualcomm NPU, not Intel, but appreciate it either way. Unfortunately Intel locked its NPU into a tight little OpenVINO Toolkit world. It's actually fantastic for image recognition and object detection. And that is all, apparently lol |
Another MS SL7/arm64 + npu owner here. Bump! |
+1 Intel Ultra 9 185H. Intel Al Boost. |
+1 please 🙏 |
+1 |
Snapdragon X Elite and Core Ultra have NPUs built-in. However, it seems that the capabilities of the NPU can only be utilized with software that supports it. Does LM Studio have any plans to support these NPUs?
On the new Copilot+ PC standard computers, the GPU is integrated into the CPU and shares memory with the main system RAM, so graphics capability is a concern compared to laptops with NVIDIA GPUs. That is why, even though these computers are touted as AI-ready, some workloads use the NPU instead of the GPU. It would be very helpful if LM Studio could also support NPUs.