Issue: OoM after multiple generations #279

@freelancer2000

Description

After generating several images with FLUX on an RTX 3080 (10 GB), I start hitting OoM errors, and every further generation fails until I close and restart RuinedFooocus.

(Using the flux1-dev-Q4_K_S.gguf model.)
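
For reference, here is a minimal diagnostic sketch of the kind of check I would run between generations. This is not RuinedFooocus code; report_and_free_vram is a made-up name, but gc and the torch.cuda calls are standard Python/PyTorch APIs:

    import gc
    import torch

    def report_and_free_vram(tag: str) -> None:
        # Hypothetical helper, for illustration only.
        # memory_allocated(): bytes currently held by live tensors.
        # memory_reserved(): bytes the caching allocator holds from the driver.
        allocated = torch.cuda.memory_allocated() / 2**20
        reserved = torch.cuda.memory_reserved() / 2**20
        print(f"[{tag}] allocated={allocated:.0f} MiB, reserved={reserved:.0f} MiB")
        gc.collect()              # drop unreachable Python references first
        torch.cuda.empty_cache()  # then return unused cached blocks to the driver

If the allocated figure keeps climbing from one generation to the next even after this cleanup, something (the model, a LoRA, or cached conditioning) is still being referenced and cannot be freed — which would match the behavior in the logs below.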

LoRAs loaded: [('fluxlisimo_biomech_lora-FLUX_v1.safetensors', 2.0)]
100%|██████████████████████████████████████████████████████████████████████████████████| 20/20 [01:06<00:00, 3.31s/it]
Time taken: 70.92 seconds
Pipeline process
Loading LoRAs: fluxlisimo_biomech_lora-FLUX_v1.safetensors
LoRAs loaded: [('fluxlisimo_biomech_lora-FLUX_v1.safetensors', 1.5)]
100%|██████████████████████████████████████████████████████████████████████████████████| 20/20 [02:10<00:00, 6.52s/it]
Time taken: 138.08 seconds
Pipeline process
Loading LoRAs: fluxlisimo_biomech_lora-FLUX_v1.safetensors
LoRAs loaded: [('fluxlisimo_biomech_lora-FLUX_v1.safetensors', 1.2)]
70%|█████████████████████████████████████████████████████████▍ | 14/20 [00:46<00:19, 3.29s/it]
Time taken: 52.30 seconds
Pipeline process
Traceback (most recent call last):
File "J:\RuinedFooocus_main_2.0.0.win64\python_embeded\lib\site-packages\gradio\queueing.py", line 625, in process_events
response = await route_utils.call_process_api(
File "J:\RuinedFooocus_main_2.0.0.win64\python_embeded\lib\site-packages\gradio\route_utils.py", line 322, in call_process_api
output = await app.get_blocks().process_api(
File "J:\RuinedFooocus_main_2.0.0.win64\python_embeded\lib\site-packages\gradio\blocks.py", line 2191, in process_api
result = await self.call_function(
File "J:\RuinedFooocus_main_2.0.0.win64\python_embeded\lib\site-packages\gradio\blocks.py", line 1688, in call_function
processed_input, progress_index, _ = special_args(
File "J:\RuinedFooocus_main_2.0.0.win64\python_embeded\lib\site-packages\gradio\helpers.py", line 968, in special_args
inputs.insert(i, type_hint(event_data.target, event_data._data))
File "J:\RuinedFooocus_main_2.0.0.win64\python_embeded\lib\site-packages\gradio\events.py", line 191, in init
self.value: Any = data["value"]
KeyError: 'value'
Loading LoRAs: fluxlisimo_biomech_lora-FLUX_v1.safetensors
LoRAs loaded: [('fluxlisimo_biomech_lora-FLUX_v1.safetensors', 1.0)]
Time taken: 2.41 seconds
Pipeline process
Exception in thread Thread-10 (worker):
Traceback (most recent call last):
File "threading.py", line 1016, in _bootstrap_inner
File "threading.py", line 953, in run
File "J:\RuinedFooocus_main_2.0.0.win64\RuinedFooocus\modules\async_worker.py", line 422, in worker
handler(task)
File "J:\RuinedFooocus_main_2.0.0.win64\RuinedFooocus\modules\async_worker.py", line 409, in handler
process(gen_data)
File "J:\RuinedFooocus_main_2.0.0.win64\RuinedFooocus\modules\async_worker.py", line 366, in process
_process(gen_data)
File "J:\RuinedFooocus_main_2.0.0.win64\RuinedFooocus\modules\async_worker.py", line 249, in _process
imgs = pipeline.process(
File "J:\RuinedFooocus_main_2.0.0.win64\python_embeded\lib\site-packages\torch\utils_contextlib.py", line 116, in decorate_context
return func(*args, **kwargs)
File "J:\RuinedFooocus_main_2.0.0.win64\RuinedFooocus\modules\sdxl_pipeline.py", line 534, in process
if self.textencode("+", positive_prompt, clip_skip):
File "J:\RuinedFooocus_main_2.0.0.win64\RuinedFooocus\modules\sdxl_pipeline.py", line 468, in textencode
self.conditions[id]["cache"] = CLIPTextEncode().encode(
File "J:\RuinedFooocus_main_2.0.0.win64\RuinedFooocus\repositories\ComfyUI\nodes.py", line 69, in encode
return (clip.encode_from_tokens_scheduled(tokens), )
File "J:\RuinedFooocus_main_2.0.0.win64\RuinedFooocus\repositories\ComfyUI\comfy\sd.py", line 166, in encode_from_tokens_scheduled
pooled_dict = self.encode_from_tokens(tokens, return_pooled=return_pooled, return_dict=True)
File "J:\RuinedFooocus_main_2.0.0.win64\RuinedFooocus\repositories\ComfyUI\comfy\sd.py", line 228, in encode_from_tokens
o = self.cond_stage_model.encode_token_weights(tokens)
File "J:\RuinedFooocus_main_2.0.0.win64\RuinedFooocus\repositories\ComfyUI\comfy\text_encoders\flux.py", line 53, in encode_token_weights
t5_out, t5_pooled = self.t5xxl.encode_token_weights(token_weight_pairs_t5)
File "J:\RuinedFooocus_main_2.0.0.win64\RuinedFooocus\repositories\ComfyUI\comfy\sd1_clip.py", line 45, in encode_token_weights
o = self.encode(to_encode)
File "J:\RuinedFooocus_main_2.0.0.win64\RuinedFooocus\repositories\ComfyUI\comfy\sd1_clip.py", line 288, in encode
return self(tokens)
File "J:\RuinedFooocus_main_2.0.0.win64\python_embeded\lib\site-packages\torch\nn\modules\module.py", line 1751, in _wrapped_call_impl
return self._call_impl(*args, **kwargs)
File "J:\RuinedFooocus_main_2.0.0.win64\python_embeded\lib\site-packages\torch\nn\modules\module.py", line 1762, in _call_impl
return forward_call(*args, **kwargs)
File "J:\RuinedFooocus_main_2.0.0.win64\RuinedFooocus\repositories\ComfyUI\comfy\sd1_clip.py", line 250, in forward
embeds, attention_mask, num_tokens = self.process_tokens(tokens, device)
File "J:\RuinedFooocus_main_2.0.0.win64\RuinedFooocus\repositories\ComfyUI\comfy\sd1_clip.py", line 204, in process_tokens
tokens_embed = self.transformer.get_input_embeddings()(tokens_embed, out_dtype=torch.float32)
File "J:\RuinedFooocus_main_2.0.0.win64\python_embeded\lib\site-packages\torch\nn\modules\module.py", line 1751, in _wrapped_call_impl
return self._call_impl(*args, **kwargs)
File "J:\RuinedFooocus_main_2.0.0.win64\python_embeded\lib\site-packages\torch\nn\modules\module.py", line 1762, in _call_impl
return forward_call(*args, **kwargs)
File "J:\RuinedFooocus_main_2.0.0.win64\RuinedFooocus\repositories\ComfyUI\comfy\ops.py", line 237, in forward
return self.forward_comfy_cast_weights(*args, **kwargs)
File "J:\RuinedFooocus_main_2.0.0.win64\RuinedFooocus\repositories\calcuis_gguf\pig.py", line 235, in forward_comfy_cast_weights
out = self.forward_ggml_cast_weights(input, *args, **kwargs)
File "J:\RuinedFooocus_main_2.0.0.win64\RuinedFooocus\repositories\calcuis_gguf\pig.py", line 265, in forward_ggml_cast_weights
weight, _bias = self.cast_bias_weight(self, device=input.device,
File "J:\RuinedFooocus_main_2.0.0.win64\python_embeded\lib\site-packages\torch_dynamo\eval_frame.py", line 838, in _fn
return fn(*args, **kwargs)
File "J:\RuinedFooocus_main_2.0.0.win64\RuinedFooocus\repositories\calcuis_gguf\pig.py", line 229, in cast_bias_weight
weight = s.get_weight(s.weight.to(device), dtype)
File "J:\RuinedFooocus_main_2.0.0.win64\RuinedFooocus\repositories\calcuis_gguf\pig.py", line 201, in get_weight
weight = dequantize_tensor(tensor, dtype, self.dequant_dtype)
File "J:\RuinedFooocus_main_2.0.0.win64\RuinedFooocus\repositories\calcuis_gguf\gguf_connector\quant2.py", line 19, in dequantize_tensor
return dequantize(tensor.data, qtype, oshape, dtype=dequant_dtype).to(
File "J:\RuinedFooocus_main_2.0.0.win64\RuinedFooocus\repositories\calcuis_gguf\gguf_connector\quant2.py", line 31, in dequantize
blocks = dequantize_blocks(blocks, block_size, type_size, dtype)
File "J:\RuinedFooocus_main_2.0.0.win64\RuinedFooocus\repositories\calcuis_gguf\gguf_connector\quant2.py", line 111, in dequantize_blocks_Q6_K
ql = (ql & 15).reshape((n_blocks, -1, 32))
File "J:\RuinedFooocus_main_2.0.0.win64\python_embeded\lib\site-packages\torch_tensor.py", line 1668, in torch_function
ret = func(*args, **kwargs)
torch.OutOfMemoryError: CUDA out of memory. Tried to allocate 126.00 MiB. GPU 0 has a total capacity of 10.00 GiB of which 0 bytes is free. Of the allocated memory 8.86 GiB is allocated by PyTorch, and 351.38 MiB is reserved by PyTorch but unallocated. If reserved but unallocated memory is large try setting PYTORCH_CUDA_ALLOC_CONF=expandable_segments:True to avoid fragmentation. See documentation for Memory Management (https://pytorch.org/docs/stable/notes/cuda.html#environment-variables)
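
The error text itself points at fragmentation (351.38 MiB is reserved but unallocated while only 126 MiB was requested) and suggests PYTORCH_CUDA_ALLOC_CONF=expandable_segments:True. A minimal sketch of applying that from Python follows; the variable name and value come straight from the message above, and the allocator reads it when it first initializes, so it has to be set before the first CUDA allocation:

    import os

    # The caching allocator reads PYTORCH_CUDA_ALLOC_CONF when it first
    # initializes, so this must run before the first CUDA allocation --
    # simplest is before torch is imported at all.
    os.environ["PYTORCH_CUDA_ALLOC_CONF"] = "expandable_segments:True"

    import torch  # imported only after the variable is in place

On this Windows embedded-Python install, putting set PYTORCH_CUDA_ALLOC_CONF=expandable_segments:True in the launcher .bat before starting should have the same effect. Note this only works around fragmentation; it will not help if VRAM is genuinely accumulating across generations.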
