fix the ops issue when integrate vllm hpu extension to TGI #116

sywangyi · 2025-03-19T05:55:48Z

No description provided.

Signed-off-by: Wang, Yi A <[email protected]>

sywangyi · 2025-03-19T06:07:33Z

Hi, I'd like to integrate op-level optimizations such as flat-PA and MoE into TGI. I found some issues during the integration, so I am submitting fixes for them. Please help review and merge. Thanks! The MoE issue was found while enabling https://huggingface.co/microsoft/Phi-3.5-MoE-instruct

sywangyi · 2025-03-19T06:08:54Z

The fix in pipelined_pa addresses an issue found in the TGI benchmark, where only block 0 is used.
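One plausible reading of this comment, together with the commit message "should not squeeze the 0 dim", is that a squeeze call with no explicit dimension also removes the block axis when a single block is in use. The sketch below is not the code from this PR. It is a pure-Python model of squeeze shape semantics, and the `(num_blocks, num_heads, 1)` layout and the helper name `squeeze_shape` are assumptions made for illustration only.

```python
def squeeze_shape(shape, dim=None):
    """Mimic the shape semantics of a tensor squeeze.

    With dim=None every size-1 axis is dropped. With an explicit dim,
    only that axis is dropped, and only if its size is 1.
    """
    if dim is None:
        return tuple(s for s in shape if s != 1)
    dim = dim % len(shape)  # allow negative indices such as -1
    if shape[dim] != 1:
        return tuple(shape)
    return tuple(s for i, s in enumerate(shape) if i != dim)


# Hypothetical layout: (num_blocks, num_heads, 1)
many_blocks = (4, 8, 1)
one_block = (1, 8, 1)

print(squeeze_shape(many_blocks))        # (4, 8)
print(squeeze_shape(one_block))          # (8,)   block axis is lost
print(squeeze_shape(one_block, dim=-1))  # (1, 8) block axis is kept
```

Under this assumption, naming the axis to squeeze keeps the result two-dimensional regardless of how many blocks are in use, which is what downstream code indexing by block would expect.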

sywangyi · 2025-03-19T06:23:02Z

@yao-matrix


sywangyi · 2025-04-20T02:04:19Z

@mswiniarsk I have addressed the review comments.

mswiniarsk

LGTM

sywangyi added 4 commits February 23, 2025 22:48

rm vllm dependence in ops

99cf6c3

Signed-off-by: Wang, Yi A <[email protected]>

fix model_type issue in tgi

3cd9f37

Signed-off-by: Wang, Yi A <[email protected]>

fix squeeze issue if only 0 block is used, should not squeeze the 0 dim

0d2cb89

Signed-off-by: Wang, Yi A <[email protected]>

fix phimoe issue, 16 experts

c7c9384

Signed-off-by: Wang, Yi A <[email protected]>

sywangyi requested review from kzawora-intel, madamczyk-intel, michalkuligowski, mgawarkiewicz, tzielinski-habana and afierka-intel as code owners March 19, 2025 05:55

sywangyi added 3 commits March 24, 2025 00:19

fix profiler error if vllm is not installed

7917402

Signed-off-by: Wang, Yi A <[email protected]>

Merge branch 'main' into main

eaf27cc

remove vllm dependence

304b7cc

Signed-off-by: Wang, Yi A <[email protected]>
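The commits in this group make the extension usable when vllm is not installed, which is the situation inside TGI. A common way to make such a dependency optional is to guard the import and fall back to the standard library. The following is a minimal sketch of that pattern, not the change made in this PR. The helper `get_logger` and its `preferred_module` parameter are hypothetical, and the sketch assumes the preferred module exposes an `init_logger(name)` function.

```python
import importlib
import logging


def get_logger(name, preferred_module="vllm.logger"):
    """Return a logger from preferred_module if it can be imported.

    Falls back to the standard library logger when the module is missing
    or does not expose init_logger (an assumed entry point).
    """
    try:
        module = importlib.import_module(preferred_module)
        return module.init_logger(name)
    except (ImportError, AttributeError):
        # Optional dependency is absent: keep working with stdlib logging.
        return logging.getLogger(name)


logger = get_logger(__name__)
logger.info("logger ready, with or without the optional dependency")
```

Resolving the dependency inside a function, rather than at module import time, means that importing the module never fails just because the optional package is absent.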

mswiniarsk reviewed Apr 18, 2025

vllm_hpu_extension/ops.py (outdated, resolved)

vllm_hpu_extension/ops.py (outdated, resolved)

vllm_hpu_extension/profiler.py (outdated, resolved)

sywangyi added 2 commits April 19, 2025 18:48

fmt and remove unused logger

31cad96

Signed-off-by: Wang, Yi A <[email protected]>

Merge pull request #1 from sywangyi/fmt

3254eb9

fmt and remove unused logger

mswiniarsk approved these changes Apr 23, 2025

michalkuligowski approved these changes Apr 23, 2025

michalkuligowski merged commit 5b195ba into HabanaAI:main Apr 29, 2025
