Issue: OoM after multiple generations #279

@freelancer2000

Description

After generating several images with FLUX on an RTX 3080 (10 GB), I start hitting OoM errors, and every further generation fails until I close and restart RuinedFooocus.

(Using the flux1-dev-Q4_K_S.gguf model.)
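
For reference, here is a minimal diagnostic sketch of the kind of check I would run between generations. This is not RuinedFooocus code; report_and_free_vram is a made-up name, but gc and the torch.cuda calls are standard Python/PyTorch APIs:

    import gc
    import torch

    def report_and_free_vram(tag: str) -> None:
        # Hypothetical helper, for illustration only.
        # memory_allocated(): bytes currently held by live tensors.
        # memory_reserved(): bytes the caching allocator holds from the driver.
        allocated = torch.cuda.memory_allocated() / 2**20
        reserved = torch.cuda.memory_reserved() / 2**20
        print(f"[{tag}] allocated={allocated:.0f} MiB, reserved={reserved:.0f} MiB")
        gc.collect()              # drop unreachable Python references first
        torch.cuda.empty_cache()  # then return unused cached blocks to the driver

If the allocated figure keeps climbing from one generation to the next even after this cleanup, something (the model, a LoRA, or cached conditioning) is still being referenced and cannot be freed — which would match the behavior in the logs below.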

LoRAs loaded: [('fluxlisimo_biomech_lora-FLUX_v1.safetensors', 2.0)]
100%|██████████████████████████████████████████████████████████████████████████████████| 20/20 [01:06<00:00, 3.31s/it]
Time taken: 70.92 seconds
Pipeline process
Loading LoRAs: fluxlisimo_biomech_lora-FLUX_v1.safetensors
LoRAs loaded: [('fluxlisimo_biomech_lora-FLUX_v1.safetensors', 1.5)]
100%|██████████████████████████████████████████████████████████████████████████████████| 20/20 [02:10<00:00, 6.52s/it]
Time taken: 138.08 seconds
Pipeline process
Loading LoRAs: fluxlisimo_biomech_lora-FLUX_v1.safetensors
LoRAs loaded: [('fluxlisimo_biomech_lora-FLUX_v1.safetensors', 1.2)]
70%|█████████████████████████████████████████████████████████▍ | 14/20 [00:46<00:19, 3.29s/it]
Time taken: 52.30 seconds
Pipeline process
Traceback (most recent call last):
File "J:\RuinedFooocus_main_2.0.0.win64\python_embeded\lib\site-packages\gradio\queueing.py", line 625, in process_events
response = await route_utils.call_process_api(
File "J:\RuinedFooocus_main_2.0.0.win64\python_embeded\lib\site-packages\gradio\route_utils.py", line 322, in call_process_api
output = await app.get_blocks().process_api(
File "J:\RuinedFooocus_main_2.0.0.win64\python_embeded\lib\site-packages\gradio\blocks.py", line 2191, in process_api
result = await self.call_function(
File "J:\RuinedFooocus_main_2.0.0.win64\python_embeded\lib\site-packages\gradio\blocks.py", line 1688, in call_function
processed_input, progress_index, _ = special_args(
File "J:\RuinedFooocus_main_2.0.0.win64\python_embeded\lib\site-packages\gradio\helpers.py", line 968, in special_args
inputs.insert(i, type_hint(event_data.target, event_data._data))
File "J:\RuinedFooocus_main_2.0.0.win64\python_embeded\lib\site-packages\gradio\events.py", line 191, in init
self.value: Any = data["value"]
KeyError: 'value'
Loading LoRAs: fluxlisimo_biomech_lora-FLUX_v1.safetensors
LoRAs loaded: [('fluxlisimo_biomech_lora-FLUX_v1.safetensors', 1.0)]
Time taken: 2.41 seconds
Pipeline process
Exception in thread Thread-10 (worker):
Traceback (most recent call last):
File "threading.py", line 1016, in _bootstrap_inner
File "threading.py", line 953, in run
File "J:\RuinedFooocus_main_2.0.0.win64\RuinedFooocus\modules\async_worker.py", line 422, in worker
handler(task)
File "J:\RuinedFooocus_main_2.0.0.win64\RuinedFooocus\modules\async_worker.py", line 409, in handler
process(gen_data)
File "J:\RuinedFooocus_main_2.0.0.win64\RuinedFooocus\modules\async_worker.py", line 366, in process
_process(gen_data)
File "J:\RuinedFooocus_main_2.0.0.win64\RuinedFooocus\modules\async_worker.py", line 249, in _process
imgs = pipeline.process(
File "J:\RuinedFooocus_main_2.0.0.win64\python_embeded\lib\site-packages\torch\utils_contextlib.py", line 116, in decorate_context
return func(*args, **kwargs)
File "J:\RuinedFooocus_main_2.0.0.win64\RuinedFooocus\modules\sdxl_pipeline.py", line 534, in process
if self.textencode("+", positive_prompt, clip_skip):
File "J:\RuinedFooocus_main_2.0.0.win64\RuinedFooocus\modules\sdxl_pipeline.py", line 468, in textencode
self.conditions[id]["cache"] = CLIPTextEncode().encode(
File "J:\RuinedFooocus_main_2.0.0.win64\RuinedFooocus\repositories\ComfyUI\nodes.py", line 69, in encode
return (clip.encode_from_tokens_scheduled(tokens), )
File "J:\RuinedFooocus_main_2.0.0.win64\RuinedFooocus\repositories\ComfyUI\comfy\sd.py", line 166, in encode_from_tokens_scheduled
pooled_dict = self.encode_from_tokens(tokens, return_pooled=return_pooled, return_dict=True)
File "J:\RuinedFooocus_main_2.0.0.win64\RuinedFooocus\repositories\ComfyUI\comfy\sd.py", line 228, in encode_from_tokens
o = self.cond_stage_model.encode_token_weights(tokens)
File "J:\RuinedFooocus_main_2.0.0.win64\RuinedFooocus\repositories\ComfyUI\comfy\text_encoders\flux.py", line 53, in encode_token_weights
t5_out, t5_pooled = self.t5xxl.encode_token_weights(token_weight_pairs_t5)
File "J:\RuinedFooocus_main_2.0.0.win64\RuinedFooocus\repositories\ComfyUI\comfy\sd1_clip.py", line 45, in encode_token_weights
o = self.encode(to_encode)
File "J:\RuinedFooocus_main_2.0.0.win64\RuinedFooocus\repositories\ComfyUI\comfy\sd1_clip.py", line 288, in encode
return self(tokens)
File "J:\RuinedFooocus_main_2.0.0.win64\python_embeded\lib\site-packages\torch\nn\modules\module.py", line 1751, in _wrapped_call_impl
return self._call_impl(*args, **kwargs)
File "J:\RuinedFooocus_main_2.0.0.win64\python_embeded\lib\site-packages\torch\nn\modules\module.py", line 1762, in _call_impl
return forward_call(*args, **kwargs)
File "J:\RuinedFooocus_main_2.0.0.win64\RuinedFooocus\repositories\ComfyUI\comfy\sd1_clip.py", line 250, in forward
embeds, attention_mask, num_tokens = self.process_tokens(tokens, device)
File "J:\RuinedFooocus_main_2.0.0.win64\RuinedFooocus\repositories\ComfyUI\comfy\sd1_clip.py", line 204, in process_tokens
tokens_embed = self.transformer.get_input_embeddings()(tokens_embed, out_dtype=torch.float32)
File "J:\RuinedFooocus_main_2.0.0.win64\python_embeded\lib\site-packages\torch\nn\modules\module.py", line 1751, in _wrapped_call_impl
return self._call_impl(*args, **kwargs)
File "J:\RuinedFooocus_main_2.0.0.win64\python_embeded\lib\site-packages\torch\nn\modules\module.py", line 1762, in _call_impl
return forward_call(*args, **kwargs)
File "J:\RuinedFooocus_main_2.0.0.win64\RuinedFooocus\repositories\ComfyUI\comfy\ops.py", line 237, in forward
return self.forward_comfy_cast_weights(*args, **kwargs)
File "J:\RuinedFooocus_main_2.0.0.win64\RuinedFooocus\repositories\calcuis_gguf\pig.py", line 235, in forward_comfy_cast_weights
out = self.forward_ggml_cast_weights(input, *args, **kwargs)
File "J:\RuinedFooocus_main_2.0.0.win64\RuinedFooocus\repositories\calcuis_gguf\pig.py", line 265, in forward_ggml_cast_weights
weight, _bias = self.cast_bias_weight(self, device=input.device,
File "J:\RuinedFooocus_main_2.0.0.win64\python_embeded\lib\site-packages\torch_dynamo\eval_frame.py", line 838, in _fn
return fn(*args, **kwargs)
File "J:\RuinedFooocus_main_2.0.0.win64\RuinedFooocus\repositories\calcuis_gguf\pig.py", line 229, in cast_bias_weight
weight = s.get_weight(s.weight.to(device), dtype)
File "J:\RuinedFooocus_main_2.0.0.win64\RuinedFooocus\repositories\calcuis_gguf\pig.py", line 201, in get_weight
weight = dequantize_tensor(tensor, dtype, self.dequant_dtype)
File "J:\RuinedFooocus_main_2.0.0.win64\RuinedFooocus\repositories\calcuis_gguf\gguf_connector\quant2.py", line 19, in dequantize_tensor
return dequantize(tensor.data, qtype, oshape, dtype=dequant_dtype).to(
File "J:\RuinedFooocus_main_2.0.0.win64\RuinedFooocus\repositories\calcuis_gguf\gguf_connector\quant2.py", line 31, in dequantize
blocks = dequantize_blocks(blocks, block_size, type_size, dtype)
File "J:\RuinedFooocus_main_2.0.0.win64\RuinedFooocus\repositories\calcuis_gguf\gguf_connector\quant2.py", line 111, in dequantize_blocks_Q6_K
ql = (ql & 15).reshape((n_blocks, -1, 32))
File "J:\RuinedFooocus_main_2.0.0.win64\python_embeded\lib\site-packages\torch_tensor.py", line 1668, in torch_function
ret = func(*args, **kwargs)
torch.OutOfMemoryError: CUDA out of memory. Tried to allocate 126.00 MiB. GPU 0 has a total capacity of 10.00 GiB of which 0 bytes is free. Of the allocated memory 8.86 GiB is allocated by PyTorch, and 351.38 MiB is reserved by PyTorch but unallocated. If reserved but unallocated memory is large try setting PYTORCH_CUDA_ALLOC_CONF=expandable_segments:True to avoid fragmentation. See documentation for Memory Management (https://pytorch.org/docs/stable/notes/cuda.html#environment-variables)
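
The error text itself points at fragmentation (351.38 MiB is reserved but unallocated while only 126 MiB was requested) and suggests PYTORCH_CUDA_ALLOC_CONF=expandable_segments:True. A minimal sketch of applying that from Python follows; the variable name and value come straight from the message above, and the allocator reads it when it first initializes, so it has to be set before the first CUDA allocation:

    import os

    # The caching allocator reads PYTORCH_CUDA_ALLOC_CONF when it first
    # initializes, so this must run before the first CUDA allocation --
    # simplest is before torch is imported at all.
    os.environ["PYTORCH_CUDA_ALLOC_CONF"] = "expandable_segments:True"

    import torch  # imported only after the variable is in place

On this Windows embedded-Python install, putting set PYTORCH_CUDA_ALLOC_CONF=expandable_segments:True in the launcher .bat before starting should have the same effect. Note this only works around fragmentation; it will not help if VRAM is genuinely accumulating across generations.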
