This repository supports generating multi-layer transparent images (composed of multiple RGBA image layers) from a global text prompt and an anonymous region layout (bounding boxes without per-layer captions). The anonymous region layout can be either predicted by an LLM or specified manually by users.
- Anonymous Layout: Requires only a single global caption to generate multiple layers, eliminating the need for individual captions for each layer.
- High Layer Capacity: Supports the generation of 50+ layers, enabling complex multi-layer outputs.
- Efficiency: Remains efficient compared with full attention and spatial-temporal attention mechanisms.
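To make the input and output formats above concrete, here is a minimal Python sketch of an anonymous region layout and of stacking RGBA layers. It is an illustration only, not code from this repository: the names `AnonymousLayout`, `Box`, and `over`, the `(x0, y0, x1, y1)` pixel box convention, and the use of standard "source-over" alpha compositing are all assumptions made for this example.

```python
from dataclasses import dataclass
from typing import List, Tuple

# Hypothetical convention: (x0, y0, x1, y1) in pixels, with x1 and y1 exclusive.
Box = Tuple[int, int, int, int]
# Straight-alpha RGBA pixel with every channel in [0, 1].
Pixel = Tuple[float, float, float, float]


@dataclass
class AnonymousLayout:
    """One global caption plus bounding boxes that carry no per-layer captions."""
    global_caption: str
    canvas_size: Tuple[int, int]  # (width, height)
    boxes: List[Box]              # one box per layer, listed bottom to top

    def validate(self) -> None:
        """Raise ValueError if any box is empty or falls outside the canvas."""
        width, height = self.canvas_size
        for i, (x0, y0, x1, y1) in enumerate(self.boxes):
            if not (0 <= x0 < x1 <= width and 0 <= y0 < y1 <= height):
                raise ValueError(f"box {i} is empty or outside the canvas")


def over(top: Pixel, bottom: Pixel) -> Pixel:
    """Composite one RGBA pixel over another with the source-over operator."""
    tr, tg, tb, ta = top
    br, bg, bb, ba = bottom
    out_a = ta + ba * (1.0 - ta)
    if out_a == 0.0:
        return (0.0, 0.0, 0.0, 0.0)  # both pixels are fully transparent

    def blend(t: float, b: float) -> float:
        return (t * ta + b * ba * (1.0 - ta)) / out_a

    return (blend(tr, br), blend(tg, bg), blend(tb, bb), out_a)


# Example: a full-canvas background layer plus one smaller foreground layer.
layout = AnonymousLayout(
    global_caption="a birthday poster with balloons",
    canvas_size=(512, 512),
    boxes=[(0, 0, 512, 512), (64, 64, 256, 256)],
)
layout.validate()
```

Only the global caption carries text in this sketch. The boxes say where each layer goes but not what it contains, which is what makes the layout anonymous.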
This repository previously contained code and pretrained model weights for generating multi-layer transparent images from a global text prompt and an anonymous region layout. However, because the model was trained on data that may have come from illegal sources, we have removed the model weights and inference checkpoints from this repository, along with all associated download links. If you have any questions, please contact the original authors through official channels.
As a result:
- The pretrained models and associated checkpoints are no longer available for download.
- No runnable or usable code for inference or training is provided.
- We do not provide any means to use or reproduce the model at this time.
- The training code, which relied on this data, has also been removed.
