merged_prefill for V1 #1342

Merged: 17 commits into habana_main from dev/madamczyk/v1_merged_prefill on Jun 18, 2025

@madamczyk-intel commented May 29, 2025

Notable changes [V1]:

  • experimental support for merged_prefill in V1, including support for APC (automatic prefix caching)
  • refactored prompt batch creation, a prerequisite for future work on prompt_chunking
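
For readers unfamiliar with the idea, the sketch below illustrates one common reading of merged prefill: several prompts are concatenated into a single sequence, and an attention mask keeps each token causal within its own prompt only. This is an assumption about the general technique, not code from this PR; `merge_prompts` and `build_mask` are hypothetical names used only for illustration.

```python
from typing import List, Tuple


def merge_prompts(prompts: List[List[int]]) -> Tuple[List[int], List[int]]:
    """Concatenate prompts into one token list (hypothetical helper).

    Returns the merged tokens and, for each token, the index of the
    prompt it came from.
    """
    tokens: List[int] = []
    seq_ids: List[int] = []
    for seq_id, prompt in enumerate(prompts):
        tokens.extend(prompt)
        seq_ids.extend([seq_id] * len(prompt))
    return tokens, seq_ids


def build_mask(seq_ids: List[int]) -> List[List[bool]]:
    """Build a block-diagonal causal mask (hypothetical helper).

    mask[q][k] is True when query position q may attend to key
    position k: both belong to the same prompt and k is not after q.
    """
    n = len(seq_ids)
    return [
        [seq_ids[q] == seq_ids[k] and k <= q for k in range(n)]
        for q in range(n)
    ]
```

Merging this way avoids padding every prompt to the longest one in the batch, at the cost of building a mask over the combined length.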

extension PR: HabanaAI/vllm-hpu-extension#226

@madamczyk-intel
Author

/run-gaudi-tests


Signed-off-by: Michal Adamczyk <[email protected]>
@madamczyk-intel force-pushed the dev/madamczyk/v1_merged_prefill branch from d4933e9 to c663fe6 on June 5, 2025 09:58

Signed-off-by: Michal Adamczyk <[email protected]>

@madamczyk-intel requested a review from Copilot on June 5, 2025 11:20
Copilot

This comment was marked as outdated.


@madamczyk-intel marked this pull request as ready for review on June 16, 2025 08:08
Signed-off-by: Michal Adamczyk <[email protected]>
Signed-off-by: Michal Adamczyk <[email protected]>

@Copilot left a comment

Pull Request Overview

This PR introduces experimental support for merged_prefill in V1, including APC support, and refactors prompt batch creation in preparation for future prompt_chunking work.

  • Replaces environment variable usage with get_config().merged_prefill in the HPU model runner
  • Updates prefill metadata function signatures and default parameters in HPU attention backend
  • Updates the vllm-hpu-extension dependency and adds new Jenkins test configurations for merged prefill
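
The first bullet describes moving a feature toggle from a direct environment-variable read to a config object. A minimal sketch of that pattern follows. Everything in it is hypothetical: `RuntimeConfig`, this local `get_config()`, `choose_prefill_path`, and the `EXAMPLE_MERGED_PREFILL` variable name are stand-ins for illustration, not the actual vllm-hpu-extension API.

```python
import os
from dataclasses import dataclass
from functools import lru_cache


@dataclass(frozen=True)
class RuntimeConfig:
    """Hypothetical config object holding feature flags."""
    merged_prefill: bool = False


def _parse_bool(raw: str) -> bool:
    # Accept a few common spellings of "enabled".
    return raw.strip().lower() in {"1", "true", "yes", "on"}


@lru_cache(maxsize=1)
def get_config() -> RuntimeConfig:
    """Read the environment once and cache the resulting config."""
    # EXAMPLE_MERGED_PREFILL is a made-up variable name for this sketch.
    raw = os.environ.get("EXAMPLE_MERGED_PREFILL", "false")
    return RuntimeConfig(merged_prefill=_parse_bool(raw))


def choose_prefill_path() -> str:
    """Callers query the config instead of reading os.environ directly."""
    return "merged" if get_config().merged_prefill else "padded"
```

Centralizing the lookup this way means the flag is parsed in one place, so call sites such as a model runner do not each need their own environment parsing.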

Reviewed Changes

Copilot reviewed 5 out of 5 changed files in this pull request and generated 1 comment.

| File | Description |
| --- | --- |
| vllm/worker/hpu_model_runner.py | Updated to use get_config().merged_prefill instead of environment variables for merged prefill configuration |
| vllm/v1/attention/backends/hpu_attn.py | Refactored prefill metadata functions to remove now-ignored parameters and to update their API for HPU V1 |
| requirements/hpu.txt | Updated commit pointer for vllm-hpu-extension dependency |
| .jenkins/test_config.yaml | Added new test jobs for merged prefill functionality |

@madamczyk-intel changed the title from "[draft] merged_prefill for V1" to "merged_prefill for V1" on Jun 17, 2025
@adobrzyn

/run-gaudi-tests

@adobrzyn merged commit eb2b279 into habana_main on Jun 18, 2025
52 checks passed
@adobrzyn deleted the dev/madamczyk/v1_merged_prefill branch on June 18, 2025 08:23