
Pull requests: HabanaAI/vllm-fork


[PoC] Add max padding ratio to padding aware scheduler [label: habana (Issues or PRs submitted by Habana Labs)]
#407 opened Oct 18, 2024 by kzawora-intel (Draft)

Add models-tiny CI step with Llama3.2-1B [label: habana (Issues or PRs submitted by Habana Labs)]
#440 opened Oct 28, 2024 by kzawora-intel (Draft)

[WIP] Add HPU support to vLLM v1
#487 opened Nov 12, 2024 by kzawora-intel (Draft)
19 of 23 tasks

Add in Dockerfile.hpu.ubi [label: external (Issues or PRs submitted by external users)]
#602 opened Dec 9, 2024 by Xaenalt

[WIP] Add HPU support to vLLM v1 - cont. [label: stale]
#609 opened Dec 10, 2024 by kzawora-intel
21 of 23 tasks

Add exponential bucketing integration
#642 opened Dec 17, 2024 by kzawora-intel

Chunked Prefill
#656 opened Dec 20, 2024 by hlahkar (Draft)
Enabled and optimized GLM-4v-9b on Gaudi [label: New Model (Issue or PR to enable a new model)]
#691 opened Jan 16, 2025 by gyou2021

Enable roberta embedding
#786 opened Feb 5, 2025 by yeonsily

Support qwenvl model for HPU [label: New Model (Issue or PR to enable a new model)]
#793 opened Feb 7, 2025 by yingjie-han
Resolve Speculative Decode RTE
#823 opened Feb 13, 2025 by tannervoas742

[CI] Add APC tests
#866 opened Feb 25, 2025 by kzawora-intel

Bump jinja2 from 3.1.4 to 3.1.6 [labels: dependencies (Pull requests that update a dependency file), python (Pull requests that update python code)]
#891 opened Mar 6, 2025 by dependabot[bot]