Pull requests: NVIDIA/TensorRT-LLM
Move allreduce_strategy from committed api to reference
#5147
opened Jun 12, 2025 by
HuiGao-NV
[https://nvbugspro.nvidia.com/bug/5329655] [feat] Pytorch path add spec dec param to attention op
#5146
opened Jun 12, 2025 by
jhaotingc
Add Wechat_Group_QR_Code.png to docs/source/media and main page of TR…
#5142
opened Jun 12, 2025 by
AdamzNV
[TRTLLM-5589] feat: Minor optimizations for tunable FP8 batched GEMM op.
#5139
opened Jun 12, 2025 by
hyukn
enh(doc): Add ci-overview in docs/source/reference/
#5137
opened Jun 11, 2025 by
venkywonka
[nvbug 5333996 ][fix] Unload XQA cubins early to avoid static lifetime
#5133
opened Jun 11, 2025 by
lowsfer
Enable trtllm-bench to run LoRA and add basic e2e perf testing capability for LoRA in PyT flow
#5130
opened Jun 11, 2025 by
amitz-nv