A new method of loading the data in batches was developed and applied to the Colab notebook. It fixes the out-of-memory issues by loading batch_size examples at a time, instead of loading everything at once or in large chunks. However, with this method the original error-to-signal loss calculation blows up, so the loss function was changed to MSE. This is a trade-off, because the original error-to-signal loss produces more accurate models, especially on highly distorted/complex guitar signals.
Investigate using the original error-to-signal loss function with the custom Sequence dataloader class from the Colab notebook.
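
For reference, here is a minimal sketch of what a batch-wise loader could look like. It is not the class from the Colab notebook: the name `WindowBatchLoader`, the sliding-window layout, and the parameters `window_len` and `batch_size` are assumptions made for illustration. It uses only NumPy and implements `__len__` and `__getitem__`, which are the two methods a `tf.keras.utils.Sequence` subclass has to provide.

```python
import math
import numpy as np


class WindowBatchLoader:
    """Hypothetical batch loader: builds input windows one batch at a time.

    Only the raw 1-D signals are kept in memory; the (batch, window_len, 1)
    arrays are created on demand in __getitem__.
    """

    def __init__(self, x, y, window_len, batch_size):
        self.x = np.asarray(x, dtype=np.float32)  # input (dry) signal
        self.y = np.asarray(y, dtype=np.float32)  # target (processed) signal
        self.window_len = window_len
        self.batch_size = batch_size
        # Number of full windows that fit in the signal.
        self.num_windows = len(self.x) - window_len + 1

    def __len__(self):
        # Number of batches per epoch (the last batch may be smaller).
        return math.ceil(self.num_windows / self.batch_size)

    def __getitem__(self, idx):
        start = idx * self.batch_size
        stop = min(start + self.batch_size, self.num_windows)
        starts = np.arange(start, stop)
        offsets = np.arange(self.window_len)
        # Each row is one window of window_len consecutive input samples.
        x_batch = self.x[starts[:, None] + offsets[None, :]]
        # The target is the output sample aligned with the end of each window.
        y_batch = self.y[starts + self.window_len - 1]
        return x_batch[..., None], y_batch[:, None]
```

With a 10-sample signal, `window_len=4` and `batch_size=3`, this gives 7 windows split into 3 batches, the last of which holds a single window.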
Update: It looks like the pre-emphasis filter is calculated incorrectly, as noted here:
GuitarML/PedalNetRT#16 (comment)
Apply the fix of using [t-1] and check whether it resolves the instability issues.
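
As a reference for testing, here is a NumPy sketch of an error-to-signal loss with a first-order pre-emphasis filter that uses the previous sample, y[t] = x[t] - coeff * x[t-1]. This is an illustration, not the code from the notebook or the linked comment: the function names, the coefficient of 0.95, the `eps` term, and the choice to pass the first sample through unchanged are all assumptions.

```python
import numpy as np


def pre_emphasis(x, coeff=0.95):
    """First-order pre-emphasis: y[t] = x[t] - coeff * x[t-1].

    The first sample has no predecessor, so it is passed through unchanged
    (an assumption; other boundary conventions are possible).
    """
    x = np.asarray(x, dtype=np.float64)
    return np.concatenate(
        [x[..., :1], x[..., 1:] - coeff * x[..., :-1]], axis=-1
    )


def esr_loss(y_true, y_pred, coeff=0.95, eps=1e-10):
    """Error-to-signal ratio computed on pre-emphasized signals.

    eps keeps the denominator away from zero on silent targets, which is
    one possible source of a loss that blows up.
    """
    y_true = pre_emphasis(y_true, coeff)
    y_pred = pre_emphasis(y_pred, coeff)
    num = np.sum((y_true - y_pred) ** 2)
    den = np.sum(y_true ** 2) + eps
    return num / den
```

A quick sanity check: identical signals should give a loss of 0, and an all-zero prediction should give a loss very close to 1. If the same checks are applied to a Keras version of the loss on batches from the Sequence dataloader, they may help show whether the [t-1] fix removes the instability.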