Pull requests: microsoft/DeepSpeed
[BUG FIX]:fix get torch.version.cuda error when cuda is None in rocm
#6909 opened Dec 24, 2024 by hj-wei
Adds ignore_index to sequence parallel cross entropy
#6882 opened Dec 17, 2024 by ronald-d-rogers
Fix: forbid repeated deepspeed.initialize on training objects
#6874 opened Dec 16, 2024 by traincheck-team
[inf] Add config var to enable keeping module on host
#6846 opened Dec 10, 2024 by oelayan7
Add the missing view operations from sequence parallel(async).
#6750 opened Nov 14, 2024 by inkcherry
Training ops kernels: Speeding up the Llama-based MoE architectures
#6734 opened Nov 8, 2024 by RezaYazdaniAminabadi (Draft)
Allow launcher to include --include=node3, not just --include=node3:1,2,3,4,5,6,7,8
#6698 opened Nov 1, 2024 by stephen-nju
Reduce the device bubble introduced by heavy loop synchronization in coalesced fetch/release(z3_leaf_module)
#6694 opened Oct 31, 2024 by inkcherry
Support the parallel conversion from ZeRO checkpoints to FP32/FP16/BF16 param weight
#6655 opened Oct 23, 2024 by xylian86
5 tasks done