CUDA out of memory. 12G VRAM is not enough? #21

Open · wangjia184 opened this issue Jun 6, 2024 · 4 comments

@wangjia184

CUDA out of memory. Tried to allocate 6.51 GiB. GPU 0 has a total capacity of 11.73 GiB of which 911.38 MiB is free. Process 69810 has 10.82 GiB memory in use. Of the allocated memory 10.39 GiB is allocated by PyTorch, and 194.87 MiB is reserved by PyTorch but unallocated. If reserved but unallocated memory is large try setting PYTORCH_CUDA_ALLOC_CONF=expandable_segments:True to avoid fragmentation. See documentation for Memory Management (https://pytorch.org/docs/stable/notes/cuda.html#environment-variables)

It cannot run on my RTX 4070 Ti with 12 GB of VRAM.
How much VRAM is needed?

I see it loads three models (SAM / VitMatte / GroundingDino). Can they be loaded and unloaded one at a time to reduce memory usage?
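For reference, the allocator setting mentioned at the end of the error message can be tried as a first step (a sketch, not a fix for the underlying model size: it only mitigates fragmentation and must take effect before torch makes any CUDA allocation, e.g. at the very top of the launch script):

import os

# must run before the CUDA caching allocator is initialized
os.environ["PYTORCH_CUDA_ALLOC_CONF"] = "expandable_segments:True"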

@YeL6 (Collaborator) commented Jun 6, 2024

Thank you for your interest in our work! We appreciate your question!

Matte-Anything is an interactive matting tool that works on the image you provide, so the required VRAM depends heavily on the resolution of the input image. You can try reducing the image resolution to make Matte-Anything run on your device. I recently tried an image at 1280x1280, which required about 15 GB of VRAM; when I reduced the resolution to 720x1280, it required about 11 GB. You can experiment with different resolutions!
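If it helps, here is a minimal downscaling sketch (plain PIL, not Matte-Anything code; the 1280-pixel cap is only an example based on the numbers above):

from PIL import Image

def resize_for_matting(path, max_side=1280):
    # shrink large inputs so the longer side is at most max_side pixels
    img = Image.open(path).convert("RGB")
    w, h = img.size
    scale = max_side / max(w, h)
    if scale < 1.0:
        img = img.resize((int(w * scale), int(h * scale)), Image.LANCZOS)
    return img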

Regarding your second question, I think unloading the models directly may not be feasible, but you can try using a lightweight SAM model instead.
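For the lightweight-SAM route, a sketch of loading the smaller vit_b backbone with the segment-anything registry (the checkpoint path is a placeholder, and Matte-Anything's own model-loading code may differ):

from segment_anything import sam_model_registry, SamPredictor

# vit_b weights are roughly 0.4 GB versus roughly 2.4 GB for vit_h
sam = sam_model_registry["vit_b"](checkpoint="pretrained/sam_vit_b_01ec64.pth")
sam.to("cuda")
predictor = SamPredictor(sam)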

@wangjia184 (Author) commented Jun 6, 2024

@YeL6 Thanks for this amazing project!

As I understand it, SAM first segments the picture and a trimap is generated based on user input; the trimap then guides VitMatte to perform alpha matting. I am not quite sure why DINO is involved here.

But it seems to me these models are not used at the same time. Would it be possible to load each model in a child process that terminates after prediction, so that its VRAM is freed?

Here is an example:

import os
import asyncio
import sys

# path of the current python script
current_script = os.path.abspath(__file__)

async def process(image_filename):
    # start a child process by executing this same script with the image as an argument;
    # the child loads the model, runs prediction, and exits, so its VRAM is released
    proc = await asyncio.create_subprocess_exec(
        sys.executable,
        current_script,
        image_filename,
        stdout=asyncio.subprocess.PIPE,
        stderr=asyncio.subprocess.PIPE
    )

    # communicate() reads both streams concurrently and waits for the child to exit,
    # which avoids the deadlock that sequential read() calls can cause on large outputs
    stdout, stderr = await proc.communicate()

    # determine success or failure and return the result
    print(proc.returncode)
    print(stdout)
    print(stderr)
    return stdout

# when this file executes in the child process
if __name__ == "__main__":
    if len(sys.argv) != 2:
        print(f"Usage: python {os.path.basename(current_script)} image_filename")
        sys.exit(1)

    # retrieve the parameter
    image_filename = sys.argv[1]

    # this is where the model would be loaded and prediction performed;
    # for the example, just echo the parameter
    print("Parameter 1:", image_filename)

@wangjia184 (Author)

I tried the largest model, vit_h, and it works well. So the OOM issue can be avoided by loading only one model at a time.

@imperator-maximus

I would suggest another approach here using the Guided Filter:
https://github.com/perrying/guided-filter-pytorch
Run matting at a lower resolution such as 1024x1024, then use the Guided Filter to bring the result up to any higher resolution (e.g. 4000x4000).
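In case it is useful, here is a minimal fast-guided-filter sketch in plain PyTorch rather than the linked library (whose exact API I have not checked); the grayscale guide and the radius/eps values are assumptions. It shows the idea: matte at low resolution, then transfer the alpha to full resolution using the high-res image as guidance.

import torch
import torch.nn.functional as F

def box_filter(x, r):
    # mean filter over a (2r+1)x(2r+1) window, output same size as input
    k = 2 * r + 1
    return F.avg_pool2d(x, kernel_size=k, stride=1, padding=r, count_include_pad=True)

def fast_guided_upsample(alpha_lr, guide_lr, guide_hr, r=8, eps=1e-4):
    # alpha_lr: (1,1,h,w) low-res alpha matte, guide_lr: (1,1,h,w) low-res grayscale image,
    # guide_hr: (1,1,H,W) full-res grayscale image; all float tensors in [0,1]
    mean_I = box_filter(guide_lr, r)
    mean_p = box_filter(alpha_lr, r)
    cov_Ip = box_filter(guide_lr * alpha_lr, r) - mean_I * mean_p
    var_I = box_filter(guide_lr * guide_lr, r) - mean_I * mean_I
    # local linear model: alpha ~= a * guide + b
    a = cov_Ip / (var_I + eps)
    b = mean_p - a * mean_I
    # smooth the coefficients, upsample them, and apply them to the high-res guide
    H, W = guide_hr.shape[-2:]
    a_hr = F.interpolate(box_filter(a, r), size=(H, W), mode="bilinear", align_corners=False)
    b_hr = F.interpolate(box_filter(b, r), size=(H, W), mode="bilinear", align_corners=False)
    return (a_hr * guide_hr + b_hr).clamp(0, 1)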
