Question about how to use batches of images in ControlNet+ip-Adapter to generate images separately without the influence of the condition images in one batch #7933
-
Hello everyone, I am using ControlNet + IP-Adapter to generate images of materials (computer graphics / rendering). Generating a single material image conditioned on one IP-Adapter image and one ControlNet image works very well. However, when I switch to a batch of images, the results are influenced by all of the condition images in the batch, not just the one belonging to each sample. Thanks for everyone's help! The code I have tested is as follows:
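The code from the post is not shown above; below is a minimal sketch of the single-image setup the post describes as working. The checkpoint names, the canny ControlNet variant, and the file names are assumptions for illustration, not taken from the post:

```python
import torch
from diffusers import ControlNetModel, StableDiffusionControlNetPipeline
from diffusers.utils import load_image

# Assumed checkpoints; the post does not say which ones were used.
controlnet = ControlNetModel.from_pretrained(
    "lllyasviel/sd-controlnet-canny", torch_dtype=torch.float16
)
pipeline = StableDiffusionControlNetPipeline.from_pretrained(
    "runwayml/stable-diffusion-v1-5", controlnet=controlnet, torch_dtype=torch.float16
).to("cuda")
pipeline.load_ip_adapter("h94/IP-Adapter", subfolder="models", weight_name="ip-adapter_sd15.bin")

# One ControlNet condition image and one IP-Adapter reference image -> one result.
control_image = load_image("control_image_1.png")
ip_image = load_image("ip_image_1.png")

result = pipeline(
    prompt="",
    image=control_image,
    ip_adapter_image=ip_image,
    num_inference_steps=20,
).images[0]
```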
Replies: 4 comments
-
It seems that loading multiple IP-Adapters to handle a batch in the way shown above is designed so that all of them affect the generated results together. I changed my approach and wrote the following code to work around the problem.
The difference between the two pieces of code is mainly in how the prompt_embeds are built. I found that generating the prompt_embeds separately for each sample and concatenating them works. I hope my experience can help you. If there is another solution, please write and tell me. Thank you guys!
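The two snippets the reply refers to are not included in the thread; the following is a minimal sketch of the "build prompt_embeds per sample and concatenate them" idea using `pipeline.encode_prompt`. It is a reconstruction of the described approach under the same assumed checkpoints and file names as above, not the author's actual code:

```python
import torch
from diffusers import ControlNetModel, StableDiffusionControlNetPipeline
from diffusers.utils import load_image

controlnet = ControlNetModel.from_pretrained(
    "lllyasviel/sd-controlnet-canny", torch_dtype=torch.float16
)
pipeline = StableDiffusionControlNetPipeline.from_pretrained(
    "runwayml/stable-diffusion-v1-5", controlnet=controlnet, torch_dtype=torch.float16
).to("cuda")
pipeline.load_ip_adapter(
    "h94/IP-Adapter", subfolder="models", weight_name=["ip-adapter_sd15.bin"] * 4
)

imgs_condition = [load_image(f"control_image_{i}.png") for i in range(1, 5)]
imgs_adapter = [load_image(f"ip_image_{i}.png") for i in range(1, 5)]

# Build prompt_embeds one sample at a time, then concatenate along the batch axis.
prompt_embeds_list, negative_embeds_list = [], []
for _ in range(4):
    pe, ne = pipeline.encode_prompt(
        "",
        device="cuda",
        num_images_per_prompt=1,
        do_classifier_free_guidance=True,
        negative_prompt="",
    )
    prompt_embeds_list.append(pe)
    negative_embeds_list.append(ne)

prompt_embeds = torch.cat(prompt_embeds_list)
negative_prompt_embeds = torch.cat(negative_embeds_list)

output = pipeline(
    prompt_embeds=prompt_embeds,
    negative_prompt_embeds=negative_prompt_embeds,
    image=imgs_condition,
    ip_adapter_image=imgs_adapter,
    num_inference_steps=20,
)
```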
-
I want to understand your problem a bit more, so here is a simplified version of what you're doing, without the list comprehensions:

```python
controlnet = ControlNetModel.from_pretrained(...)
pipeline = StableDiffusionControlNetPipeline.from_pretrained(...).to("cuda")
pipeline.load_ip_adapter(
    "h94/IP-Adapter",
    subfolder="models",
    weight_name=["ip-adapter_sd15.bin"] * 4,
)

imgs_condition = ["control_image_1.png", "control_image_2.png", "control_image_3.png", "control_image_4.png"]
imgs_adapter = ["ip_image_1.png", "ip_image_2.png", "ip_image_3.png", "ip_image_4.png"]
prompt = [""] * 4

output = pipeline(
    prompt=prompt,
    image=imgs_condition,
    ip_adapter_image=imgs_adapter,
    negative_prompt=prompt,
    num_inference_steps=20,
    generator=generator,
)
```

So the problem here is that you want to generate a batch of 4 images, each with a different IP-Adapter image. I tested this and IMO it is a bug, but I don't really use batch generation and don't know what the real use case for this is. @yiyixuxu @fabiorigano WDYT?
-
This bug bit me today too.

Expected: Got:

MultiDiffusion generates a batch of images in parallel with different prompts and then blends the latents together. Extending this to ip_adapters is a natural fit, but I unexpectedly discovered that the ip_adapters are currently all combined across the batch.
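For context, here is a minimal sketch of the latent-blending step that sentence describes. It illustrates the idea rather than MultiDiffusion's actual implementation; the function name and mask convention are assumptions:

```python
import torch

# Each prompt denoises its own copy of the latents; the copies are then
# averaged back together under per-prompt region masks.
def blend_latents(latents_per_prompt: torch.Tensor, masks: torch.Tensor) -> torch.Tensor:
    """latents_per_prompt: (N, C, H, W); masks: (N, 1, H, W) with values in [0, 1]."""
    weighted = (latents_per_prompt * masks).sum(dim=0)
    weight_sum = masks.sum(dim=0).clamp(min=1e-8)
    return weighted / weight_sum
```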
-
Here's a first stab at what a fix might look like. For me, this commit results in each ip_adapter being run once when the number of IP-Adapters matches the batch size. This is working fine for me now.
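The commit itself isn't reproduced here; the following is a rough, hypothetical illustration of the routing idea described above. The function name and tensor shapes are my own and do not come from the commit or from diffusers' attention-processor internals:

```python
import torch

# When the number of IP-Adapter contributions equals the batch size, route
# adapter i only to sample i instead of adding every adapter's contribution
# to every sample.
def combine_ip_contributions(hidden_states, ip_contributions, scale=1.0):
    """hidden_states: (B, L, D); each element of ip_contributions: (B, L, D)."""
    batch_size = hidden_states.shape[0]
    if len(ip_contributions) == batch_size:
        # Per-sample routing: sample i receives only adapter i's contribution.
        routed = torch.stack([contrib[i] for i, contrib in enumerate(ip_contributions)])
        return hidden_states + scale * routed
    # Current behaviour: every adapter affects every sample in the batch.
    return hidden_states + scale * sum(ip_contributions)
```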