demo_img2vid: Error Code 1: Cuda Runtime (out of memory) #4353
What GPU are you using? Most likely your GPU doesn't have enough VRAM to run this demo.
@kevinch-nv: My system has an NVIDIA H100 GPU with 80 GB of HBM3 memory, but I am still encountering an out-of-memory (OOM) error while building the TensorRT engine for onnx-svd-xt-1-1/unet-temp.opt/model.onnx: Error Code 1: Cuda Runtime (out of memory), i.e. an excessive memory request during TensorRT engine building. This happens even after explicitly passing --batch-size=1 --use-cuda-graph --height 256 --width 512.
This is the command I used for multi-GPU execution (the system has two GPUs):
Hi @kevinch-nv, I wanted to follow up on the CUDA out-of-memory (OOM) issue encountered while building the TensorRT engine for svd-xt-1.1, and to ask whether it is possible to deploy the SVD-XT-1.1 model using Triton Inference Server, given the current setup and TensorRT engine files.
Hi @kevinch-nv, I tried with FP8:
python3 demo_img2vid.py --version svd-xt-1.1 --onnx-dir onnx-svd-xt-1-1 --engine-dir engine-svd-xt-1-1 --hf-token=$HF_TOKEN --fp8
[E] Error Code: 9: Skipping tactic 0x0000000000000001 due to exception Unsupported data type FP8.
CUDA_VISIBLE_DEVICES=0,1 python3 demo_img2vid.py --version svd-xt-1.1 --onnx-dir onnx-svd-xt-1-1 --engine-dir engine-svd-xt-1-1 --hf-token=$HF_TOKEN --batch-size=1 --use-cuda-graph
/usr/local/lib/python3.12/dist-packages/modelopt/torch/utils/import_utils.py:25: UserWarning: Failed to import apex plugin due to: ImportError("cannot import name 'UnencryptedCookieSessionFactoryConfig' from 'pyramid.session' (unknown location)")
warnings.warn(f"Failed to import {plugin_name} plugin due to: {repr(e)}")
[I] Initializing StableDiffusion img2vid demo using TensorRT
[I] Autoselected scheduler: Euler
[I] Load Scheduler EulerDiscreteScheduler from: pytorch_model/svd-xt-1.1/IMG2VID/eulerdiscretescheduler/scheduler
Building TensorRT engine for onnx-svd-xt-1-1/unet-temp.opt/model.onnx: engine-svd-xt-1-1/unet-temp.trt10.7.0.post1.plan
Strongly typed mode is False for onnx-svd-xt-1-1/unet-temp.opt/model.onnx
/usr/local/lib/python3.12/dist-packages/polygraphy/backend/trt/util.py:590: DeprecationWarning: Use Deprecated in TensorRT 10.1. Superseded by explicit quantization. instead.
calibrator = config.int8_calibrator
[E] [defaultAllocator.cpp::allocate::31] Error Code 1: Cuda Runtime (out of memory)
[E] Error Code: 9: Skipping tactic 0x0000000000000000 due to exception [tunable_graph.cpp:create:118] autotuning: User allocator error allocating 86114304000-byte buffer
[E] [defaultAllocator.cpp::allocate::31] Error Code 1: Cuda Runtime (out of memory)
[E] Error Code: 9: Skipping tactic 0x0000000000000000 due to exception [tunable_graph.cpp:create:118] autotuning: User allocator error allocating 86114304000-byte buffer
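For scale, a quick back-of-the-envelope check (my own, assuming the H100's "80GB" means 80 × 10⁹ bytes): the 86,114,304,000-byte buffer that autotuning tried to allocate is larger than the card's entire HBM3 capacity, so this single scratch buffer alone cannot fit regardless of what else is resident on the GPU:

```python
# Back-of-the-envelope check of the failing allocation from the log above.
# Assumption: "80GB of HBM3" means 80 * 10**9 bytes (decimal gigabytes).
requested_bytes = 86_114_304_000      # buffer size from the autotuning error
h100_vram_bytes = 80 * 10**9          # total H100 HBM3 capacity

print(f"requested: {requested_bytes / 2**30:.1f} GiB")  # about 80.2 GiB
print(f"available: {h100_vram_bytes / 2**30:.1f} GiB")  # about 74.5 GiB
print(requested_bytes > h100_vram_bytes)                # True
```

This suggests the fix has to reduce the size of the tensors being tuned (lower resolution, shorter video length, or fewer frames per pass) rather than freeing up memory elsewhere.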