v1.2.4 - The Prodigal child returns

Latest

bghira released this 22 Jan 23:33

46708c1

Stable Diffusion 3.5 Medium fine-tuned on v1.2.4

Features

Ignore final epochs for changes in dataloader length

Use --ignore_final_epochs=true to disable epoch tracking so that training runs until your --max_train_steps value is reached.
- This is helpful after adding or removing a substantial amount of data in your training set.
- Remember to use --max_train_steps instead of --num_epochs when using this option.
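
To see why a changed dataloader length interferes with epoch tracking, it helps to look at how the epoch count relates to the step budget. The sketch below is illustrative arithmetic only, not SimpleTuner's internal logic; the `epochs_for_steps` helper and the batch counts are hypothetical.

```python
import math


def epochs_for_steps(max_train_steps: int, steps_per_epoch: int) -> int:
    """Number of (possibly partial) epochs needed to reach a step budget."""
    if steps_per_epoch <= 0:
        # An empty dataloader would otherwise cause a division by zero.
        raise ValueError("steps_per_epoch must be positive")
    return math.ceil(max_train_steps / steps_per_epoch)


# Same step budget, before and after removing data from the training set.
before = epochs_for_steps(max_train_steps=10_000, steps_per_epoch=1_000)
after = epochs_for_steps(max_train_steps=10_000, steps_per_epoch=400)
print(before, after)  # 10 25
```

In this example, an epoch limit of 10 computed for the old dataset would stop training after only 4,000 steps once the dataset shrinks, which is why a step-based limit is the safer stopping condition.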

New experimental Prodigy optimiser

Thanks to @LoganBooker, we now have a new implementation of Prodigy that supports stochastic rounding and the other features needed to reintroduce support for this optimiser.
- You may want to change --optimizer_config=d_coef=1 to a lower d_coef value to lower the ramp-up and the maximum LR.
- Changing the LR is currently untested and unsupported.
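
As background on why stochastic rounding matters for low-precision training, the toy example below accumulates many small updates on a coarse grid. It is a sketch of the general technique, not the Prodigy implementation; all function names are hypothetical, and the grid is kept uniform for simplicity (real bf16 spacing doubles at 2.0).

```python
import math
import random

STEP = 2.0 ** -7  # spacing between adjacent bf16 values in [1, 2)


def round_nearest(x: float, step: float = STEP) -> float:
    """Round to the closest grid point."""
    return round(x / step) * step


def round_stochastic(x: float, rng: random.Random, step: float = STEP) -> float:
    """Round up with probability equal to the fractional distance to the next grid point."""
    lower = math.floor(x / step)
    frac = x / step - lower
    return (lower + (1 if rng.random() < frac else 0)) * step


def accumulate(rounder, updates: int = 1000, delta: float = 0.001) -> float:
    """Apply many small updates to a weight, rounding after each one."""
    w = 1.0
    for _ in range(updates):
        w = rounder(w + delta)
    return w


rng = random.Random(0)
nearest = accumulate(round_nearest)
stochastic = accumulate(lambda x: round_stochastic(x, rng))
print(nearest, stochastic)
```

With round-to-nearest, every update is smaller than half a grid step, so the weight never leaves 1.0. With stochastic rounding, the updates survive in expectation, and the result lands near the exact sum of 2.0.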

Bugfixes

Image preprocessing

VAE cache elements were being cropped too far for square input images that were larger than the target resolution.
- Example: An input of 1024x1024 with a resolution of 512 and resolution_type=pixel_area or =area was cropped straight from 1024px to 512px instead of being resized to the target first.
- Not impacted: An input of 1024x1024 with a resolution of 1024 and any resolution_type worked as intended.
- Not impacted: An input of 1024px with a resolution of 512 and resolution_type=pixel.
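
The effect of the overly aggressive crop can be quantified with a small calculation. This is a simplified sketch for square images only and does not reproduce SimpleTuner's preprocessing code; both helper names are hypothetical.

```python
def retained_fraction_direct_crop(side: int, target: int) -> float:
    """Crop a target x target window straight out of a side x side image."""
    crop = min(side, target)
    return (crop * crop) / (side * side)


def retained_fraction_resize_then_crop(side: int, target: int) -> float:
    """Resize the square image down to the target first, then crop."""
    resized = min(side, target)  # this sketch never upscales
    crop = min(resized, target)
    return (crop * crop) / (resized * resized)


print(retained_fraction_direct_crop(1024, 512))       # 0.25
print(retained_fraction_resize_then_crop(1024, 512))  # 1.0
```

Under these assumptions, cropping directly keeps only a quarter of the original picture, while resizing to the target as an intermediate step keeps all of it.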

You'll want to recreate VAE caches and dataset metadata for this bugfix.

To remove the metadata, find and delete the *.json files from your image directories.
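
One way to do this is with a short script. The sketch below assumes that every *.json file under the given directory (including subdirectories) is dataset metadata that is safe to delete; check the dry-run output first if your image folders hold other JSON files. The function names and the example path are hypothetical.

```python
from pathlib import Path


def find_metadata_files(image_dir):
    """Return all *.json files under image_dir, searching subdirectories too."""
    return sorted(p for p in Path(image_dir).rglob("*.json") if p.is_file())


def remove_metadata_files(image_dir, dry_run=True):
    """List the metadata files, and delete them only when dry_run is False."""
    files = find_metadata_files(image_dir)
    for path in files:
        if dry_run:
            print(f"would delete {path}")
        else:
            path.unlink()
    return files


# Example usage (hypothetical path):
# remove_metadata_files("/path/to/images")                 # preview only
# remove_metadata_files("/path/to/images", dry_run=False)  # actually delete
```

Defaulting to a dry run is a deliberate choice here, since the glob cannot tell metadata apart from unrelated JSON files.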

Sana

Fix modeling code after PEFT LoRA addition broke compatibility

AMD ROCm

Update list of BNB optimisers and another minor fix for MI300+ users

Validations

Fix validations not running for the final model export at the end of training

What's Changed

instanceprompt strategy fix for caption discovery by @bghira in #1271
sana: fix modeling code reference to attention_kwargs by @bghira in #1273
Small fixes for running on AMD GPUs by @rkarhila-amd in #1276
add a special case for square input images where we need to resize to the target as intermediary, which can be considered a safe operation by @bghira in #1281
catch and handle wandb error when it is disabled by @bghira in #1282
add debug logging for factory initialisation in multigpu systems where it seems to get stuck, and format some files by @bghira in #1283
add ignore_final_epochs to workaround epoch tracking oddness when changing dataloader length by @bghira in #1285
fix divide by zero when reducing dataloader length by @bghira in #1286
update options doc for --ignore_final_epochs by @bghira in #1287
fix check for running the final validations by @bghira in #1290
Fixing broken python-tests action by @diodotosml in #1293
add prodigy optimiser with full bf16 support by @bghira in #1294
update docs for optimizer args by @bghira in #1295

New Contributors

@rkarhila-amd made their first contribution in #1276
@diodotosml made their first contribution in #1293

Full Changelog: v1.2.3...v1.2.4

Contributors

LoganBooker, bghira, and 2 other contributors
