
v1.2.4 - The Prodigal child returns

@bghira bghira released this 22 Jan 23:33
· 8 commits to main since this release
46708c1

Stable Diffusion 3.5 Medium fine-tuned on v1.2.4

Features

Ignore final epochs for changes in dataloader length

  • Use --ignore_final_epochs=true to disable tracking of epochs so that your max train steps value is reached.
    • This is helpful if you remove or add a substantial amount of data from your training set.
    • Remember to use --max_train_steps instead of --num_epochs when using this option.
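The interaction between dataset size and epoch count can be sketched with a little arithmetic. This is only an illustration, not SimpleTuner's internal logic: `epochs_needed` is a hypothetical helper, and it ignores gradient accumulation and multi-GPU sharding. It shows why a fixed step budget maps to a different number of epochs once the dataloader length changes.

```python
import math


def epochs_needed(max_train_steps: int, num_samples: int, batch_size: int) -> int:
    """Hypothetical helper: epochs required to reach a fixed step budget."""
    # One epoch is one full pass over the data, in batches.
    steps_per_epoch = math.ceil(num_samples / batch_size)
    return math.ceil(max_train_steps / steps_per_epoch)


# Same step budget, different dataset sizes.
print(epochs_needed(1000, 400, 4))  # 100 steps per epoch -> 10 epochs
print(epochs_needed(1000, 200, 4))  # 50 steps per epoch -> 20 epochs
```

Halving the dataset doubles the epoch count for the same number of steps, which is why a step-based target is the steadier reference when the data changes mid-run.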

New experimental Prodigy optimiser

  • Thanks to @LoganBooker we now have a new implementation of Prodigy that supports stochastic rounding and the other features needed to reintroduce Prodigy support.
    • You may want to adjust --optimizer_config=d_coef=1 to a lower value to make the ramp-up and max LR lower.
    • Changing LR is not currently tested/supported.
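For readers unfamiliar with stochastic rounding, here is a minimal pure-Python sketch of the general technique applied to a float32 to bf16 conversion. It is not the code used by the new Prodigy implementation; `stochastic_round_bf16` is a hypothetical name, and the sketch assumes finite inputs. The idea is that random noise is added to the 16 bits that bf16 discards, so a value rounds up with probability proportional to how close it is to the upper neighbour, and the rounding is unbiased on average.

```python
import random
import struct


def stochastic_round_bf16(x: float, rng: random.Random) -> float:
    """Round a float32-representable value to bf16 precision stochastically."""
    # Reinterpret the float32 bit pattern as an unsigned 32-bit integer.
    bits = struct.unpack("<I", struct.pack("<f", x))[0]
    # Add uniform noise to the low 16 bits that bf16 cannot store;
    # a carry into the upper 16 bits corresponds to rounding up in magnitude.
    noise = rng.getrandbits(16)
    rounded = ((bits + noise) & 0xFFFFFFFF) & 0xFFFF0000
    # Reinterpret the truncated pattern as a float again.
    return struct.unpack("<f", struct.pack("<I", rounded))[0]


rng = random.Random(0)
x = 1.0 + 2 ** -10  # lies between the bf16 neighbours 1.0 and 1.0078125
samples = [stochastic_round_bf16(x, rng) for _ in range(20000)]
print(sorted(set(samples)))          # only the two neighbouring bf16 values
print(sum(samples) / len(samples))   # close to x on average
```

With plain round-to-nearest, small updates to bf16 weights can be lost entirely; the averaging behaviour above is what makes full-bf16 training with such an optimiser practical.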

Bugfixes

Image preprocessing

  • VAE cache elements were being cropped too far for square input images that were larger than the target resolution.
    • Example: An input of 1024x1024 with a resolution of 512 and resolution_type=pixel_area or resolution_type=area was cropped straight from 1024px to 512px, without first being resized to the target.
    • Not impacted: An input of 1024x1024 with a resolution of 1024 and any resolution_type worked as intended
    • Not impacted: An input of 1024px with a resolution of 512 and resolution_type=pixel worked as intended
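To get a feel for how much image content the over-cropping discarded, the sketch below compares cropping a square image directly with resizing it to the target first. This is an illustration only; `retained_area_fraction` is a hypothetical helper and does not mirror SimpleTuner's preprocessing code.

```python
def retained_area_fraction(input_edge: int, target_edge: int, resize_first: bool) -> float:
    """Fraction of a square image's content that survives the crop (hypothetical)."""
    # Resizing a square image to the square target leaves nothing to crop away.
    working_edge = target_edge if resize_first else input_edge
    crop_edge = min(target_edge, working_edge)
    return (crop_edge / working_edge) ** 2


print(retained_area_fraction(1024, 512, resize_first=False))  # 0.25
print(retained_area_fraction(1024, 512, resize_first=True))   # 1.0
```

Under these assumptions, a direct 512px crop of a 1024x1024 image keeps only a quarter of the picture, which is why cached latents produced before this fix are worth regenerating.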

You'll want to recreate VAE caches and dataset metadata for this bugfix.

To remove the metadata, find and delete the *.json files from your image directories.
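One cautious way to do the cleanup is a short script with a dry-run mode. The sketch below is an assumption-laden example rather than an official tool: the function names and the example path are hypothetical, and it matches every *.json file under the directory, so review the dry-run output before deleting in case unrelated JSON files live alongside your images.

```python
from pathlib import Path


def find_json_files(root):
    """Recursively collect *.json files under root."""
    return sorted(p for p in Path(root).rglob("*.json") if p.is_file())


def delete_files(paths, dry_run=True):
    """Delete the given files; with dry_run=True only report what would go."""
    for path in paths:
        if dry_run:
            print(f"would delete {path}")
        else:
            path.unlink()
    return len(paths)


# Example usage (hypothetical path, adjust to your own dataset):
#   files = find_json_files("/path/to/image/dir")
#   delete_files(files)                 # dry run: lists the files
#   delete_files(files, dry_run=False)  # actually removes them
```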

Sana

  • Fix modeling code after PEFT LoRA addition broke compatibility

AMD ROCm

  • Update list of BNB optimisers and another minor fix for MI300+ users

Validations

  • Fix validations not running for the final model export at the end of training

What's Changed

  • instanceprompt strategy fix for caption discovery by @bghira in #1271
  • sana: fix modeling code reference to attention_kwargs by @bghira in #1273
  • Small fixes for running on AMD GPUs by @rkarhila-amd in #1276
  • add a special case for square input images where we need to resize to the target as intermediary, which can be considered a safe operation by @bghira in #1281
  • catch and handle wandb error when it is disabled by @bghira in #1282
  • add debug logging for factory initialisation in multigpu systems where it seems to get stuck, and format some files by @bghira in #1283
  • add ignore_final_epochs to workaround epoch tracking oddness when changing dataloader length by @bghira in #1285
  • fix divide by zero when reducing dataloader length by @bghira in #1286
  • update options doc for --ignore_final_epochs by @bghira in #1287
  • fix check for running the final validations by @bghira in #1290
  • Fixing broken python-tests action by @diodotosml in #1293
  • add prodigy optimiser with full bf16 support by @bghira in #1294
  • update docs for optimizer args by @bghira in #1295

Full Changelog: v1.2.3...v1.2.4