Pull requests: vllm-project/vllm
Pull requests list
[CI] Split Distributed Tests (4 GPUs), Entrypoints and Kernel MoE tests
ci/build
ready
ONLY add when PR is ready to merge/full CI is needed
#37100
opened Mar 15, 2026 by
avinashsingh77
3 tasks
Unwrap fused_moe (control via env var) for non-DP-EP cases
#37099
opened Mar 15, 2026 by
SouthWest7
Draft
1 of 5 tasks
Fix Hermes streaming for parameterless tool args
tool-calling
#37098
opened Mar 15, 2026 by
Shaunak00
Use contiguous arrays for request token histories
v1
#37097
opened Mar 15, 2026 by
Shaunak00
[Profiling] Add optional stage-level NVTX annotations for Nsight Systems
documentation
Improvements or additions to documentation
v1
[Bugfix] Align routed_experts with streamed and final output token spans
bug
Something isn't working
v1
#37094
opened Mar 15, 2026 by
HareshKarnan
[Frontend][Misc] Remove unused log in /is_sleeping
frontend
#37093
opened Mar 15, 2026 by
esmeetu
5 tasks
[Bugfix] Move GDN warmup after KV cache allocation to fix memory leak (#36973)
bug
Something isn't working
qwen
Related to Qwen models
#37088
opened Mar 15, 2026 by
haosdent
feat: Standardize load_weights API via AutoWeightsLoader
documentation
Improvements or additions to documentation
llama
Related to Llama models
#37085
opened Mar 15, 2026 by
akh64bit
[Misc] Add unit tests for min_p Triton sampling kernel
v1
#37083
opened Mar 15, 2026 by
gkuwanto
3 of 5 tasks
[V1, V2] Add temperature for prompt logprobs
frontend
v1
#37082
opened Mar 15, 2026 by
JacobHelwig
3 of 5 tasks
Add Mistral Guidance
frontend
structured-output
v1
#37081
opened Mar 14, 2026 by
juliendenize
3 tasks done
[Feature][Cleanup]: Optimize token data structures (list[int] to TokenArray)
kv-connector
structured-output
tpu
Related to Google TPUs
v1
#37078
opened Mar 14, 2026 by
akh64bit
[Feature][Frontend] add support for Cohere Embed v2 API
documentation
Improvements or additions to documentation
frontend
#37074
opened Mar 14, 2026 by
walterbm
4 of 5 tasks
[Bugfix] Fix harmony streaming tool call crash and argument splitting
bug
Something isn't working
frontend
gpt-oss
Related to GPT-OSS models
#37070
opened Mar 14, 2026 by
Pradyun92
3 of 5 tasks
[Bugfix] Add Qwen3.5 MoE support to benchmark_moe.py
bug
Something isn't working
performance
Performance-related issues
qwen
Related to Qwen models
#37068
opened Mar 14, 2026 by
iphands
Improve CPU platform detection fallback for source checkouts
#37067
opened Mar 14, 2026 by
ezylopx5
[Distributed] Add OfflineState bloom-filter cooperative caching KV connector
documentation
Improvements or additions to documentation
kv-connector
performance
Performance-related issues
#37066
opened Mar 14, 2026 by
Ilank1
4 tasks