v0.8.0
Pre-release

Features
This is the largest release yet, on both the client and server:
- add initial support for LoRA weights and Textual Inversions
- add support for localization to the client (#127)
- using i18next, should detect browser locale
- partially translated into French, German, and Spanish so far
- works with user models in the extras file (#144)
- core rewrite of the device worker pool to manage memory leaks (#162, #170)
- dedicated worker process per device with memory and error isolation
- restart workers on regular intervals and after memory allocation errors
- add a parameter for image batch size (#195)
- seems to support 4-5 images on a 24GB GPU and 3 images on 16GB
- available for txt2img and img2img tabs
- add prompt to upscaling tab (#187)
- add UniPC multistep scheduler
- add eta parameter for DDIM scheduler (#194)
- add option to run face correction before or after upscaling (or both, #132)
- ONNX acceleration for Real ESRGAN v3 (#113)
- add support for attention slicing, CPU offload, and other optimizations (#155)
- add an option to turn off progress bars in server logs (#158)
- fix inpainting for images < 512 (#172)
- add a loading screen while connecting to the server
- add a warning in the client when inpainting with a regular model (#54)
- add a VAE parameter when converting extra models (#145)
Device Workers
The device worker pool, which manages the background workers used to generate images, has been completely rewritten to help manage some fairly severe memory leaks in the ONNX runtime. Each worker should keep its own cache of models that have been uploaded to VRAM, and workers will be recycled after 10 jobs or when they encounter a memory allocation error.
This makes the model cache less effective, which I hope to fix in a future patch, but the previous approach consistently ran out of memory after 95-100 images. The new pool has been tested past 1000.
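The recycling behavior described above can be sketched as a minimal worker pool: a dedicated process per device, replaced after a fixed number of jobs or after a memory allocation error. This is an illustrative sketch, not the actual onnx-web implementation; all names here are hypothetical.

```python
from multiprocessing import Process, Queue

MAX_JOBS_PER_WORKER = 10  # recycle interval from the release notes


def worker_main(jobs: Queue, results: Queue) -> None:
    """Run jobs until told to stop; a real worker would keep its own model cache in VRAM."""
    while True:
        job = jobs.get()
        if job is None:  # sentinel: recycle or shut down
            break
        try:
            results.put((job, job * 2))  # stand-in for image generation
        except MemoryError:
            # die on allocation errors so the pool restarts us with fresh memory
            break


class RecyclingPool:
    """Single-device pool that restarts its worker on a regular interval."""

    def __init__(self) -> None:
        self.jobs_done = 0
        self.jobs: Queue = Queue()
        self.results: Queue = Queue()
        self.worker = self._spawn()

    def _spawn(self) -> Process:
        proc = Process(target=worker_main, args=(self.jobs, self.results))
        proc.start()
        return proc

    def _recycle(self) -> None:
        if self.worker.is_alive():
            self.jobs.put(None)
        self.worker.join()
        self.jobs_done = 0
        self.worker = self._spawn()

    def submit(self, job: int) -> int:
        # restart the worker after the job limit, or if it crashed
        if self.jobs_done >= MAX_JOBS_PER_WORKER or not self.worker.is_alive():
            self._recycle()
        self.jobs.put(job)
        _, result = self.results.get()
        self.jobs_done += 1
        return result

    def close(self) -> None:
        self.jobs.put(None)
        self.worker.join()
```

Because each worker is a separate process, a leak or crash in one job cannot poison the parent server, only the short-lived worker.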
Localization
The client now supports localization, using the excellent i18next project, and should detect your browser's locale. There are initial machine translations into French, German, and Spanish. You can also provide translated labels for custom models and Textual Inversions in your extras file.
Models and Parameters
This release also completes ONNX acceleration for the Real ESRGAN family of models and adds some missing parameters to the diffusion pipelines, including image batch size and DDIM eta. Since memory consumption is somewhat higher with ONNX, 3-4 images appears to be the maximum batch size for most commonly available cards.
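The VRAM figures above (4-5 images on a 24GB card, 3 on 16GB) suggest a conservative default for the new batch size parameter. This hypothetical helper is not part of onnx-web; it only turns the rough numbers from these notes into a rule of thumb.

```python
def suggest_batch_size(vram_gb: float) -> int:
    """Pick a conservative image batch size from available VRAM.

    Thresholds follow the rough figures reported in the release notes;
    leave headroom, since memory consumption is somewhat higher with ONNX.
    """
    if vram_gb >= 24:
        return 4  # notes report 4-5 images on 24GB; 4 is the safer choice
    if vram_gb >= 16:
        return 3  # notes report 3 images on 16GB
    return 1  # below the tested range: fall back to single images
```

Anything below the tested range falls back to single-image batches rather than guessing.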
Artifacts
- https://ssube.github.io/onnx-web/v0.8.0/index.html
- https://hub.docker.com/repository/docker/ssube/onnx-web-api
podman pull docker.io/ssube/onnx-web-api:v0.8.0-cpu-buster
podman pull docker.io/ssube/onnx-web-api:v0.8.0-cuda-ubuntu
podman pull docker.io/ssube/onnx-web-api:v0.8.0-rocm-ubuntu
- https://hub.docker.com/repository/docker/ssube/onnx-web-gui
podman pull docker.io/ssube/onnx-web-gui:v0.8.0-nginx-alpine
podman pull docker.io/ssube/onnx-web-gui:v0.8.0-nginx-bullseye
podman pull docker.io/ssube/onnx-web-gui:v0.8.0-node-alpine
podman pull docker.io/ssube/onnx-web-gui:v0.8.0-node-bullseye
- https://www.npmjs.com/package/@apextoaster/onnx-web
yarn add @apextoaster/[email protected]
- https://pypi.org/project/onnx-web/
pip install onnx-web==0.8.0
Release checklist: #217
Release milestone: https://github.com/ssube/onnx-web/milestone/7?closed=1
Release pipeline: https://git.apextoaster.com/ssube/onnx-web/-/pipelines/49361