Problem:
Because the Pilot partitioner's memory estimation is inexact, it often underestimates the memory cost of minibatch passes. During training, the model then exceeds its allocated memory bounds and errors out, typically during the backward pass.
Quick fix: Increase the double buffer space, which reduces shard sizes and leaves more free memory as headroom.
Longer-term fix: Replace the Pilot partitioner with a more exact algorithm, or with one that doesn't push up against the limits of the memory bounds.
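To illustrate why a larger double buffer leads to smaller shards, here is a minimal sketch. It is not the Pilot partitioner's actual code: the function `partition_layers`, its parameters, and the greedy packing strategy are all hypothetical, and it assumes the per-shard budget is simply device memory minus the reserved double buffer space.

```python
from typing import List


def partition_layers(
    layer_costs: List[int],
    device_memory: int,
    double_buffer: int,
) -> List[List[int]]:
    """Greedily pack consecutive layers into shards (hypothetical sketch).

    Assumption: each shard may use at most device_memory - double_buffer
    units of memory, so reserving more buffer space shrinks the shards.
    """
    budget = device_memory - double_buffer
    if budget <= 0:
        raise ValueError("double buffer leaves no memory for shards")

    shards: List[List[int]] = []
    current: List[int] = []
    used = 0
    for index, cost in enumerate(layer_costs):
        if cost > budget:
            raise ValueError(f"layer {index} alone exceeds the shard budget")
        # Close the current shard when the next layer would not fit.
        if current and used + cost > budget:
            shards.append(current)
            current, used = [], 0
        current.append(index)
        used += cost
    if current:
        shards.append(current)
    return shards


# Estimated costs (in GB) for four layers on a 16 GB device.
costs = [3, 3, 3, 3]
small_buffer = partition_layers(costs, device_memory=16, double_buffer=2)
large_buffer = partition_layers(costs, device_memory=16, double_buffer=9)
# small_buffer == [[0, 1, 2, 3]]   (one 12 GB shard, 4 GB of slack)
# large_buffer == [[0, 1], [2, 3]] (two 6 GB shards, 10 GB of slack each)
```

In this sketch, the extra slack is what absorbs an underestimated backward pass. The cost is more shards, and therefore more swapping between them.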