Pull requests: HabanaAI/vllm-hpu-extension

Pull requests list

enable fp32 softmax for flat_pa_mla
#243 opened Jun 27, 2025 by yangulei
Update dependabot.yml
#242 opened Jun 26, 2025 by michalkuligowski
Update linear.py
#239 opened Jun 25, 2025 by michalkuligowski
Integrating block_softmax
#238 opened Jun 24, 2025 by ksmusz Draft
Remove double generate
#229 opened Jun 18, 2025 by adobrzyn
Exponential bucketing tweaks
#224 opened Jun 13, 2025 by madamczyk-intel
Bucketing refactoring
#223 opened Jun 12, 2025 by adobrzyn
Find bucket with bmin not divs by step
#212 opened Jun 5, 2025 by adobrzyn
Add useful internal vllm test
#200 opened May 27, 2025 by nirda7 Draft
fix the issue that bmax not in bucket buffer
#191 opened May 22, 2025 by sywangyi
Unify FusedMoe with expert parallelism
#175 opened May 14, 2025 by mengniwang95
Optimized MoE on Gaudi
#159 opened Apr 18, 2025 by gyou2021 Draft
[FIX] fp8 gc compile error
#110 opened Mar 4, 2025 by maktukmak Draft