
Enhancing Ascend 910A Training Efficiency in LlamaFactory with NPU #3584

Merged: 4 commits into hiyouga:main on May 14, 2024

Conversation

@zhou-wjjw (Contributor)

What does this PR do?

Training efficiency on the Ascend 910A is significantly improved by leveraging the full computational power of the NPU through torch_npu, the PyTorch extension optimized for Ascend NPUs. This change yields about a tenfold increase in training efficiency.

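For context, a minimal sketch of how torch_npu is typically brought in so that the model and tensors run on the Ascend NPU. This is illustrative only, not the exact code added in this PR; the toy model and shapes are made up, and torch_npu.npu.is_available() is the availability check shipped with torch_npu.

import torch
import torch_npu  # importing torch_npu registers the "npu" device with PyTorch

device = torch.device("npu:0" if torch_npu.npu.is_available() else "cpu")
model = torch.nn.Linear(16, 16).to(device)   # toy stand-in for the actual model
inputs = torch.randn(4, 16, device=device)
outputs = model(inputs)                      # the forward pass now runs on the NPU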
@hiyouga added the "pending" label (This problem is yet to be addressed.) on May 6, 2024
@hiyouga (Owner) commented on May 6, 2024

We should first check whether the torch_npu package is available, similar to how the vllm imports are guarded:

if is_vllm_available():
    from vllm import AsyncEngineArgs, AsyncLLMEngine, RequestOutput, SamplingParams
    from vllm.lora.request import LoRARequest
    from vllm.sequence import MultiModalData
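A sketch of what an analogous guard for torch_npu could look like. The helper below is illustrative and defined inline; the repository (or Transformers, which ships an is_torch_npu_available utility) may name or implement the check differently.

import importlib.util

def is_torch_npu_available() -> bool:
    # torch_npu is only usable when torch itself is importable as well
    return (
        importlib.util.find_spec("torch") is not None
        and importlib.util.find_spec("torch_npu") is not None
    )

if is_torch_npu_available():
    import torch_npu  # noqa: F401  (the import registers the NPU backend with torch)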

@zhou-wjjw (Contributor, Author)

Quoting @hiyouga: "We should first check whether the torch_npu package is available […]"

Alright, it looks like vLLM doesn't support Ascend. No worries, I'll tweak the code a bit and see if I can get it working.
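One hedged way such a tweak could look (illustrative only, not the code that ended up in this PR): select the generation backend at runtime, so that Ascend machines, where vllm cannot be imported, fall back to the plain HuggingFace path.

import importlib.util

def is_vllm_available() -> bool:
    return importlib.util.find_spec("vllm") is not None

if is_vllm_available():
    # CUDA machines: keep the vLLM async engine
    from vllm import AsyncEngineArgs, AsyncLLMEngine  # noqa: F401
    backend = "vllm"
else:
    # Ascend/NPU machines: vLLM cannot be imported, so fall back to the
    # regular HuggingFace generation path
    backend = "huggingface"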

@zhou-wjjw (Contributor, Author)

Quoting @hiyouga: "We should first check whether the torch_npu package is available […]"

Yeah, I think updating the docs for now and letting developers decide how to handle it is a good way to go. The documentation should state clearly that vLLM and Ascend don't work together at the moment, so developers see the limitation straight away and can choose the approach that best fits their project.

@statelesshz (Contributor) left a comment


Hmm... if you want to use LLaMA-Factory on the Ascend 910A, this is the modification I would recommend.

Review comments on src/train.py (resolved)
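The review comments themselves are collapsed above, so the exact recommendation is not visible here. For background only (not the reviewed change), a common way to wire torch_npu into a training entry point is a guarded import at the top of the script; transfer_to_npu, part of torch_npu.contrib, redirects torch.cuda.* calls to the NPU.

try:
    import torch_npu  # noqa: F401
    # transfer_to_npu transparently maps torch.cuda.* calls onto the NPU
    from torch_npu.contrib import transfer_to_npu  # noqa: F401
except ImportError:
    pass  # torch_npu not installed: keep the default CUDA/CPU behaviour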
@hiyouga (Owner) commented on May 14, 2024

It now works, LGTM

@hiyouga merged commit ee4752f into hiyouga:main on May 14, 2024
1 check passed
@hiyouga added the "solved" label and removed the "pending" label on May 14, 2024
@hiyouga removed the request for review from statelesshz on May 14, 2024 at 16:06
Labels: solved
3 participants