
Pull requests: vllm-project/vllm


Pull requests list

[CI] Split Split Distributed Tests (4 GPUs), Entrypoints and Kernel MoE tests
Labels: ci/build, ready
#37100 opened Mar 15, 2026 by avinashsingh77
3 tasks
[Profiling] Add optional stage-level NVTX annotations for Nsight Systems
Labels: documentation, v1
#37095 opened Mar 15, 2026 by KOKOSde (Draft)
[Bugfix] Align routed_experts with streamed and final output token spans
Labels: bug, v1
#37094 opened Mar 15, 2026 by HareshKarnan
[Frontend][Misc] Remove unused log in /is_sleeping
Labels: frontend
#37093 opened Mar 15, 2026 by esmeetu
5 tasks
[Bugfix] Move GDN warmup after KV cache allocation to fix memory leak (#36973)
Labels: bug, qwen
#37088 opened Mar 15, 2026 by haosdent
feat: Standardize load_weights API via AutoWeightsLoader
Labels: documentation, llama
#37085 opened Mar 15, 2026 by akh64bit
[Misc] Add unit tests for min_p Triton sampling kernel
Labels: v1
#37083 opened Mar 15, 2026 by gkuwanto
3 of 5 tasks
[V1, V2] Add temperature for prompt logprobs
Labels: frontend, v1
#37082 opened Mar 15, 2026 by JacobHelwig
3 of 5 tasks
Add Mistral Guidance
Labels: frontend, structured-output, v1
#37081 opened Mar 14, 2026 by juliendenize
3 tasks done
[Feature][Frontend] add support for Cohere Embed v2 API
Labels: documentation, frontend
#37074 opened Mar 14, 2026 by walterbm
4 of 5 tasks
[Bugfix] Fix harmony parser crash on terminal tokens after end-of-message
Labels: bug, frontend, gpt-oss
#37072 opened Mar 14, 2026 by Pradyun92
3 of 5 tasks
[Bugfix] Fix Responses API harmony streaming: token splitting, missing done events, nested sequence_number
Labels: bug, frontend, gpt-oss
#37071 opened Mar 14, 2026 by Pradyun92
3 of 5 tasks
[Bugfix] Fix harmony streaming tool call crash and argument splitting
Labels: bug, frontend, gpt-oss
#37070 opened Mar 14, 2026 by Pradyun92
3 of 5 tasks
[Bugfix] Add Qwen3.5 MoE support to benchmark_moe.py
Labels: bug, performance, qwen
#37068 opened Mar 14, 2026 by iphands
[Distributed] Add OfflineState bloom-filter cooperative caching KV connector
Labels: documentation, kv-connector, performance
#37066 opened Mar 14, 2026 by Ilank1
4 tasks