Pull requests: NVIDIA/TransformerEngine

Pull requests list

Manage dependencies and add missing einops req
#1859 opened Jun 7, 2025 by ksivaman
7 of 13 tasks
[PyTorch] Add support for FP8 current scaling in operation-based API
Labels: enhancement, testing
#1858 opened Jun 6, 2025 by timmoon10
6 of 14 tasks
[JAX] GEMM custom op
Labels: 2.5.0
#1855 opened Jun 6, 2025 by denera
6 of 13 tasks
pyproject.toml
Labels: 2.5.0
#1852 opened Jun 5, 2025 by ksivaman
4 of 13 tasks
Draft: Add support for overlapping wgrad NCCL AG with dgrad GEMM
#1849 opened Jun 4, 2025 by djns99
4 of 13 tasks
[PyTorch] Inference mode disables initializing quantized weights with column-wise usage
Labels: 2.5.0, bug, enhancement
#1847 opened Jun 4, 2025 by timmoon10
6 of 13 tasks
[JAX] TensorUsage + FP8 GEMM with all layouts handling on BW
Labels: 2.5.0
#1844 opened Jun 3, 2025 by phu0ngng
8 of 13 tasks
[PyTorch Debug] Fixed the empty tensor bug in statistics computation
#1843 opened Jun 3, 2025 by pggPL
8 of 13 tasks
TE Gemma tutorial attempt#2
#1839 opened Jun 2, 2025 by sudhakarsingh27 Draft
1 task done
Make quantize_ respect the usages of the quantizer
#1836 opened May 31, 2025 by ptrendx
13 tasks
[PyTorch] Use FP16 tols for distributed tests with TF32 compute
#1831 opened May 28, 2025 by timmoon10
6 of 13 tasks
Add cuBLASMp-backed GEMM-like API to TE common
#1824 opened May 27, 2025 by mk-61
4 of 13 tasks
Add support for head_dim > 128
Labels: 2.5.0
#1797 opened May 18, 2025 by cyanguwa
9 of 13 tasks
[PyTorch][MoE] Reduce CPU Overhead By Fuse Torch Empty Calls
Labels: performance
#1793 opened May 16, 2025 by zhongbozhu
1 of 13 tasks
[common] Added support of FP4 data type
#1779 opened May 13, 2025 by Oleg-Goncharov
6 of 13 tasks
[PyTorch] Update PyTorch FSDP2 test to cover all TE layer types
Labels: testing
#1777 opened May 12, 2025 by denera
8 of 13 tasks
[PyTorch] Draft of new activation offloading API
#1762 opened May 8, 2025 by pggPL Draft
13 tasks
cache sequence chunk ids for reordering
#1757 opened May 7, 2025 by xrennvidia Draft
13 tasks
Zr te doc edits
#1745 opened May 2, 2025 by zredeaux07
12 tasks
[PyTorch] Refactor activation offloading of quantized tensors.
#1738 opened Apr 30, 2025 by pggPL
8 of 13 tasks