
Out of vram and reboot #153

Open

tdzz1102 opened this issue Oct 28, 2023 · 2 comments

@tdzz1102 (Contributor) commented Oct 28, 2023

Machine info

  • Hardware
    • 64 GB RAM
    • 32 vCPUs
    • 2× RTX 3090 GPUs
  • Software
    • Ubuntu Server 20.04
    • Python 3.11.2
    • nvidia-driver 545
    • CUDA 12.3
    • jax 0.4.19

When I set up the environment and called the FlaxWhisperPipline('openai/whisper-xxx') method to load the model, the server rebooted without any error. Only 'openai/whisper-tiny' loads correctly; the machine crashes when loading 'openai/whisper-small' or anything larger. I've tried XLA_PYTHON_CLIENT_PREALLOCATE=false as mentioned in issue #7, but it didn't work.
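For reference, a minimal sketch of how I load the model ('openai/whisper-small' stands in for the larger checkpoints that crash; note the preallocation flag only takes effect if it is set before JAX is imported):

```python
import os

# Disable XLA's up-front GPU memory grab; must be set before jax is imported.
os.environ["XLA_PYTHON_CLIENT_PREALLOCATE"] = "false"

from whisper_jax import FlaxWhisperPipline

# whisper-tiny loads fine; whisper-small and anything larger reboots the machine.
pipeline = FlaxWhisperPipline("openai/whisper-small")
```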

The image below shows the vRAM usage of my machine; gaps in the data are where the machine rebooted.

[Screenshot 2023-10-28 16:58:43: vRAM usage over time]

Is there any way to prevent Linux from rebooting automatically when vRAM usage is high?

@sanchit-gandhi (Owner) commented

Hey @tdzz1102 and sorry for the late reply! Could you try XLA_PYTHON_CLIENT_MEM_FRACTION=.XX (where .XX is the fraction of GPU memory to preallocate, e.g. .50 for 50%) to reduce the preallocation? The docs say this should help combat OOMs that occur when the programme starts: https://jax.readthedocs.io/en/latest/gpu_memory_allocation.html

You might need to play with your value of .XX, e.g. incrementally reducing from .75 to .00
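E.g. one way to set it from Python (a sketch; .50 is just a starting value, and the variable must be set before JAX is first imported):

```python
import os

# Cap XLA's GPU memory preallocation at 50%; lower the value if OOMs persist.
os.environ["XLA_PYTHON_CLIENT_MEM_FRACTION"] = ".50"

import jax  # imported only after the flag is set, so it takes effect
```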

@tdzz1102 (Contributor, Author) commented

@sanchit-gandhi I solved this by downgrading the nvidia-driver and CUDA versions (but I forgot exactly which versions 😢). The server has since expired, so I can't try this solution any longer. faster-whisper has helped me a lot in the meantime, and thank you anyway!
