merged_prefill for V1 #1342
/run-gaudi-tests |
Signed-off-by: Michal Adamczyk <[email protected]>
d4933e9 to c663fe6
Signed-off-by: Michal Adamczyk <[email protected]>
Signed-off-by: Michal Adamczyk <[email protected]>
/run-gaudi-tests |
Signed-off-by: Michal Adamczyk <[email protected]>
/run-gaudi-tests |
Signed-off-by: Michal Adamczyk <[email protected]>
…v1_merged_prefill Signed-off-by: Michal Adamczyk <[email protected]>
/run-gaudi-tests |
Signed-off-by: Michal Adamczyk <[email protected]>
…v1_merged_prefill
/run-gaudi-tests |
Signed-off-by: Michal Adamczyk <[email protected]>
/run-gaudi-tests |
Signed-off-by: Michal Adamczyk <[email protected]>
…v1_merged_prefill
/run-gaudi-tests |
…v1_merged_prefill Signed-off-by: Michal Adamczyk <[email protected]>
Signed-off-by: Michal Adamczyk <[email protected]>
…v1_merged_prefill
/run-gaudi-tests |
Signed-off-by: Michal Adamczyk <[email protected]>
Signed-off-by: Michal Adamczyk <[email protected]>
/run-gaudi-tests |
Pull Request Overview
This PR introduces experimental support for merged_prefill in V1 and refactors prompt batch creation in preparation for prompt_chunking work with APC support.
- Replaces environment variable usage with get_config().merged_prefill in the HPU model runner
- Updates prefill metadata function signatures and default parameters in HPU attention backend
- Updates the vllm-hpu-extension dependency and adds new Jenkins test configurations for merged prefill
Reviewed Changes
Copilot reviewed 5 out of 5 changed files in this pull request and generated 1 comment.
File | Description |
---|---|
vllm/worker/hpu_model_runner.py | Updated to use get_config().merged_prefill instead of environment variables for merged prefill configuration |
vllm/v1/attention/backends/hpu_attn.py | Refactored prefill metadata functions to remove now-ignored parameters and to update their API for HPU V1 |
requirements/hpu.txt | Updated commit pointer for vllm-hpu-extension dependency |
.jenkins/test_config.yaml | Added new test jobs for merged prefill functionality |
/run-gaudi-tests |
Notable changes [V1]:
extension PR: HabanaAI/vllm-hpu-extension#226