v0.8.0

Pre-release
@ssube ssube released this 11 Mar 04:53
· 1291 commits to main since this release
v0.8.0
a195bc1

Features

This is the largest release yet, on both the client and server:

  • add initial support for LoRA weights and Textual Inversions
    • Textual Inversions can be selected in the client (#179)
    • LoRA weights must be blended with their base model for now (#157)
  • add support for localization to the client (#127)
    • using i18next, should detect browser locale
    • partially translated into French, German, and Spanish so far
    • works with user models in the extras file (#144)
  • core rewrite of the device worker pool to manage memory leaks (#162, #170)
    • dedicated worker process per device with memory and error isolation
    • restart workers on regular intervals and after memory allocation errors
  • add a parameter for image batch size (#195)
    • seems to support 4-5 images on a 24GB GPU and 3 images on 16GB
    • available for txt2img and img2img tabs
  • add prompt to upscaling tab (#187)
  • add UniPC multistep scheduler
  • add eta parameter for DDIM scheduler (#194)
  • add option to run face correction before upscaling, after upscaling, or both (#132)
  • ONNX acceleration for Real ESRGAN v3 (#113)
  • add support for attention slicing, CPU offload, and other optimizations (#155)
  • add an option to turn off progress bars in server logs (#158)
  • fix inpainting for images < 512 (#172)
  • add a loading screen while connecting to the server
  • add a warning in the client when inpainting with a regular model (#54)
  • add a VAE parameter when converting extra models (#145)

Device Workers

The device worker pool, which manages the background workers used to generate images, has been completely rewritten to help manage some fairly severe memory leaks in the ONNX runtime. Each worker should keep its own cache of models that have been uploaded to VRAM, and workers will be recycled after 10 jobs or when they encounter a memory allocation error.

This makes the model cache less effective, which I hope to fix in a future patch, but the previous pool consistently ran out of memory after 95-100 images. The new pool has been tested past 1000 images.
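
The recycling policy described above can be sketched roughly as follows. This is a simplified, in-process illustration and not the actual onnx-web implementation: the names `Worker`, `DevicePool`, and `MAX_JOBS_PER_WORKER` are hypothetical, the "worker" is a plain object standing in for a dedicated process, and a Python `MemoryError` stands in for a memory allocation error from the runtime. Only the 10-job threshold and the per-worker model cache come from the notes above.

```python
from dataclasses import dataclass, field
from typing import Any, Callable, Dict, Iterable

# Recycle threshold taken from the release notes; the name is hypothetical.
MAX_JOBS_PER_WORKER = 10


@dataclass
class Worker:
    """Stand-in for one per-device worker process (hypothetical)."""
    device: str
    jobs_done: int = 0
    model_cache: Dict[str, Any] = field(default_factory=dict)


class DevicePool:
    """Keeps one worker per device and replaces it when the policy says so."""

    def __init__(self, devices: Iterable[str], max_jobs: int = MAX_JOBS_PER_WORKER):
        self.max_jobs = max_jobs
        self.workers = {device: Worker(device) for device in devices}
        self.recycles = {device: 0 for device in self.workers}

    def _recycle(self, device: str) -> None:
        # A fresh worker starts with an empty model cache, which is
        # presumably why recycling makes the cache less effective.
        self.workers[device] = Worker(device)
        self.recycles[device] += 1

    def run(self, device: str, job: Callable[[Worker], Any]) -> Any:
        worker = self.workers[device]
        try:
            result = job(worker)
        except MemoryError:
            # Assumed stand-in for a memory allocation error: recycle at once.
            self._recycle(device)
            raise
        worker.jobs_done += 1
        if worker.jobs_done >= self.max_jobs:
            self._recycle(device)
        return result
```

In this sketch, a worker that finishes its tenth job is replaced before the next job arrives, and a job that raises `MemoryError` triggers an immediate replacement while still surfacing the error to the caller.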

Localization

The client now supports localization, using the excellent i18next project, and should detect your browser's locale. There are initial machine translations into French, German, and Spanish. You can set the translation for custom models and Inversions in your extras file.

Models and Parameters

This release also completes ONNX acceleration for the Real ESRGAN family of models and adds some missing parameters to the diffusion pipelines, including image batch size and DDIM eta. Since memory consumption is somewhat higher with ONNX, 3-4 images seems to be the maximum batch size for most commonly available cards.

Artifacts

Release checklist: #217
Release milestone: https://github.com/ssube/onnx-web/milestone/7?closed=1
Release pipeline: https://git.apextoaster.com/ssube/onnx-web/-/pipelines/49361