Issues: NVIDIA/Megatron-LM
Issues list
[QUESTION] Does Megatron support tracing computation graphs with torch.fx?
#1315
opened Dec 7, 2024 by
fy-j
[BUG] When using LLaVA with freeze-LM, training on a text-only sample raises an error.
#1314
opened Dec 6, 2024 by
liveseongho
[QUESTION] How to specify the implementation of Attention?
#1313
opened Dec 6, 2024 by
renyinCheng001
[QUESTION] UnboundLocalError: local variable ‘output tensor’ referenced before assignment
#1311
opened Dec 5, 2024 by
zmtttt
[BUG] The problem of splitting transformer layers when pipeline parallelism cannot be evenly divided.
#1304
opened Nov 27, 2024 by
Baibaifan
[QUESTION] How to split the Transformer layers when the pipeline is uneven?
#1303
opened Nov 27, 2024 by
renyinCheng001
[QUESTION] Why is the initialization of the router and experts different in the MoE part?
#1302
opened Nov 27, 2024 by
mxymxy77
[BUG] an illegal memory access was encountered in MOE-MLP(GroupGemm)
#1301
opened Nov 26, 2024 by
hgdhrt
[BUG] 0.9.0 release version got param_gather_handle error with 3d parallel
#1292
opened Nov 19, 2024 by
SeunghyunSEO
[QUESTION] How to convert torch_dist format checkpoint to torch format?
#1291
opened Nov 19, 2024 by
zhangyilalala
Where can I download the tokenizer for the model mcore-llava-mistral-7b-instruct-clip336-pretraining?
#1281
opened Nov 11, 2024 by
herolxl
[QUESTION] is there any restriction to use allgather with moe_expert_capacity_factor?
#1277
opened Nov 7, 2024 by
Louis-J
[BUG] TP-comm-overlap bug when replacing TELayerNormColumnParallelLinear into TEColumnParallelLinear.
#1275
opened Nov 6, 2024 by
wplf