
Pull requests: HabanaAI/vllm-fork


[PoC] Add max padding ratio to padding aware scheduler [label: habana (Issues or PRs submitted by Habana Labs)]
#407 opened Oct 18, 2024 by kzawora-intel (Draft)

Add models-tiny CI step with Llama3.2-1B [label: habana (Issues or PRs submitted by Habana Labs)]
#440 opened Oct 28, 2024 by kzawora-intel (Draft)

[WIP] Add HPU support to vLLM v1
#487 opened Nov 12, 2024 by kzawora-intel (Draft)
19 of 23 tasks

Add in Dockerfile.hpu.ubi [label: external (Issues or PRs submitted by external users)]
#602 opened Dec 9, 2024 by Xaenalt

[WIP] Add HPU support to vLLM v1 - cont. [label: stale]
#609 opened Dec 10, 2024 by kzawora-intel
21 of 23 tasks

Add exponential bucketing integration
#642 opened Dec 17, 2024 by kzawora-intel

Chunked Prefill
#656 opened Dec 20, 2024 by hlahkar (Draft)
Enabled and optimized GLM-4v-9b on Gaudi [label: New Model (Issue or PR to enable a new model)]
#691 opened Jan 16, 2025 by gyou2021

Enable roberta embedding
#786 opened Feb 5, 2025 by yeonsily

Support qwenvl model for HPU [label: New Model (Issue or PR to enable a new model)]
#793 opened Feb 7, 2025 by yingjie-han
Resolve Speculative Decode RTE
#823 opened Feb 13, 2025 by tannervoas742

[CI] Add APC tests
#866 opened Feb 25, 2025 by kzawora-intel

Bump jinja2 from 3.1.4 to 3.1.6 [labels: dependencies (Pull requests that update a dependency file), python (Pull requests that update python code)]
#891 opened Mar 6, 2025 by dependabot[bot]