Issues: NVIDIA/TransformerEngine
[BUG] Weight gradients with TransformerEngine v2.1 don't match those with TransformerEngine v1.12
Labels: bug, tp_overlap. #1616, opened Mar 26, 2025 by okoge-kaz
[BUG] Wrong attention gradient in Transformer Engine
Labels: bug. #1615, opened Mar 26, 2025 by i-love-megatron
Can we replace only some nn.Linear layers with te.Linear and keep the others unchanged?
#1595, opened Mar 20, 2025 by zigzagcai
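For context, a minimal sketch (not taken from the issue thread) of what partial replacement could look like: te.Linear and plain nn.Linear layers can live in the same module, and only the Transformer Engine layer participates in FP8 inside fp8_autocast. The module name and shapes below are illustrative assumptions; FP8 execution additionally requires supported hardware and FP8-friendly dimensions.

```python
# Hypothetical sketch: mix a Transformer Engine linear layer with a plain
# PyTorch linear layer in one module. Only te.Linear runs in FP8 when the
# forward pass is wrapped in fp8_autocast; nn.Linear is left unchanged.
import torch
import torch.nn as nn
import transformer_engine.pytorch as te

class MixedMLP(nn.Module):
    def __init__(self, hidden: int = 1024):
        super().__init__()
        self.fc1 = te.Linear(hidden, 4 * hidden)   # replaced with Transformer Engine
        self.fc2 = nn.Linear(4 * hidden, hidden)   # kept as plain PyTorch

    def forward(self, x):
        return self.fc2(torch.nn.functional.gelu(self.fc1(x)))

model = MixedMLP().cuda()
x = torch.randn(32, 1024, device="cuda")
with te.fp8_autocast(enabled=True):   # affects only the te.Linear layer
    y = model(x)
```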
How to debug tex.fused_attn_bwd getting cuDNN Error: [cudnn_frontend] Error: No execution plans support the graph
Labels: bug. #1591, opened Mar 19, 2025 by Ir1d
Does TransformerEngine support FP8 communication such as all-gather or all-to-all?
#1579, opened Mar 14, 2025 by zigzagcai
Is it necessary to replace layers with te.* modules? If not, is it effective to use te.fp8_autocast directly?
#1556, opened Mar 11, 2025 by wangli68
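A related clarification, sketched under the assumption of current Transformer Engine behavior: fp8_autocast only changes how Transformer Engine modules execute, so wrapping unmodified torch.nn layers in the context does not make them run in FP8; replacing a layer with its te.* counterpart is what opts it in.

```python
# Sketch: fp8_autocast by itself does not convert plain PyTorch layers to FP8.
import torch
import torch.nn as nn
import transformer_engine.pytorch as te

torch_linear = nn.Linear(1024, 1024).cuda()   # unaffected by fp8_autocast
te_linear = te.Linear(1024, 1024).cuda()      # FP8-capable replacement

x = torch.randn(16, 1024, device="cuda")
with te.fp8_autocast(enabled=True):
    y_torch = torch_linear(x)  # still runs in FP32: not a Transformer Engine module
    y_te = te_linear(x)        # runs its GEMM in FP8 (on supported hardware)
```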
When I import the package 'transformer_engine.pytorch', the error message is as follows
#1541, opened Mar 6, 2025 by wangli68
Causal mask ignored in DotProductAttention
Labels: good first issue. #1524, opened Feb 28, 2025 by anthony-Neo
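For reference, a minimal sketch of requesting causal masking from DotProductAttention via attn_mask_type; tensor shapes and dtypes below are illustrative assumptions, not taken from the issue report.

```python
# Sketch: ask DotProductAttention for causal masking explicitly.
import torch
import transformer_engine.pytorch as te

seq, batch, heads, head_dim = 128, 2, 16, 64
attn = te.DotProductAttention(
    num_attention_heads=heads,
    kv_channels=head_dim,
    attn_mask_type="causal",  # causal masking requested here
    qkv_format="sbhd",        # tensors laid out as [seq, batch, heads, head_dim]
)

q = torch.randn(seq, batch, heads, head_dim, device="cuda", dtype=torch.bfloat16)
k = torch.randn_like(q)
v = torch.randn_like(q)
out = attn(q, k, v)  # expected shape: [seq, batch, heads * head_dim]
```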
How can we integrate the DeepGEMM FP8 GEMM implementation into TE's block-wise scaling?
#1509, opened Feb 26, 2025 by BolongLin
Question about the performance of GroupedLinear
Labels: performance. #1499, opened Feb 20, 2025 by XLzed
Float8Quantizer::create_tensor calculates scale_inv instead of creating an empty buffer
Labels: performance. #1491, opened Feb 18, 2025 by yaox12
Qwen1.5-0.5B fails to save the model with Hugging Face Transformers
Labels: bug. #1482, opened Feb 13, 2025 by xinpengzz
Using context parallelism with flash-attn > 2.6.1 causes an error
#1467, opened Feb 9, 2025 by south-ocean
HF Accelerate FP8 uses more GPU memory than FP16 when training an LLM
#1429, opened Jan 28, 2025 by Liufeiran123