Pull requests: NVIDIA/Megatron-LM

Pull requests list

Fix typo
#1347 opened Jan 4, 2025 by deep-sci
Update theoretical memory footprint formula
#1345 opened Jan 3, 2025 by okoge-kaz
Fix type annotation and checkpoint conversion script
#1344 opened Jan 3, 2025 by okoge-kaz
fix bugs of data preprocessing with multiple json keys
#1337 opened Dec 25, 2024 by junjzhang
Create python-package.yml
#1332 opened Dec 21, 2024 by invisiblepancake
Add Mamba TRTLLM support
#1320 opened Dec 12, 2024 by meatybobby
update network interface env
#1319 opened Dec 12, 2024 by lizamd
[Update] Print training log in rank0
#1296 opened Nov 21, 2024 by shijungg
support qwen2 hf<->mcore ckpt converter
#1290 opened Nov 19, 2024 by wenyujin333
Set torch.multiprocessing start method as 'spawn'
#1285 opened Nov 12, 2024 by hxdtest
Huvu/update t5 attentionmasktype [stale: no activity in 60 days on issue or PR]
#1273 opened Nov 4, 2024 by huvunvidia
Update t5_model.py [stale]
#1271 opened Nov 2, 2024 by huvunvidia
Enable huggingface tokenizer [stale]
#1268 opened Oct 30, 2024 by msiddaiah
fix: remove unnecessary trailing comma in statement [stale]
#1265 opened Oct 29, 2024 by singleheart
[ENHANCEMENT] Add support for Apex RMSNorm for use in qk-norm [stale]
#1261 opened Oct 28, 2024 by wdevazelhes
Add support to process gzip files [stale]
#1260 opened Oct 28, 2024 by puneeshkhanna
[Wrong spelling] Update training.py [stale]
#1229 opened Oct 21, 2024 by zyqhnu
Typo fix in readme [stale]
#1223 opened Oct 17, 2024 by alexchen4ai
support qwen2 and siglip weight conversion script to enable training … [stale]
#1221 opened Oct 16, 2024 by tao-githup
readme spelling correction [stale]
#1216 opened Oct 13, 2024 by jonassteinberg1