Hello,
I have some questions from reading ZeroBubbleVPipeScheduler, specifically about schedule W (as in the title).
Could you please explain exactly how this part is implemented?
In detail:
Where does it strip out the weight grad?
Where is the weight grad calculated? (In schedule B, in backward_step?)
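To make the question concrete, here is a minimal sketch of what I understand the B/W split to mean in zero-bubble scheduling: B computes only the gradient with respect to the layer input (which the previous stage needs), and W computes the gradient with respect to the weights later. This is a plain-Python toy for a linear layer y = x @ w; the names `backward_b` and `backward_w` are hypothetical and are not taken from the ColossalAI code.

```python
# Toy illustration only, not ColossalAI code: a linear layer y = x @ w
# whose backward pass is split into a B step and a deferred W step.

def matmul(a, b):
    # Naive matrix product on nested lists.
    return [[sum(a[i][k] * b[k][j] for k in range(len(b)))
             for j in range(len(b[0]))]
            for i in range(len(a))]

def transpose(m):
    return [list(row) for row in zip(*m)]

def forward(x, w):
    return matmul(x, w)

def backward_b(grad_y, w):
    # B step: input gradient, dL/dx = dL/dy @ w^T.
    return matmul(grad_y, transpose(w))

def backward_w(x, grad_y):
    # W step: weight gradient, dL/dw = x^T @ dL/dy.
    return matmul(transpose(x), grad_y)

x = [[1.0, 2.0]]                          # shape 1x2
w = [[3.0, 4.0, 5.0], [6.0, 7.0, 8.0]]    # shape 2x3
y = forward(x, w)

grad_y = [[1.0, 1.0, 1.0]]                # gradient arriving from the next stage
grad_x = backward_b(grad_y, w)            # can be sent upstream right away

deferred = (x, grad_y)                    # what W needs, saved for later
grad_w = backward_w(*deferred)            # run whenever the schedule places W
```

In this toy version the W step only needs the saved input and the output gradient, so it can run at any later point. What I would like to know is where the scheduler does the equivalent of saving `deferred` and where it runs the W step.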
duanjunwen changed the title from "[QUESTION] In ZeroBubbleVPipeScheduler, where is the calculation and communication of the weight grad done?" to "[QUESTION] Question about ZeroBubbleVPipeScheduler schedule W" on Aug 1, 2024