Deploying HuggingFace model/pipeline using uvicorn-gunicorn-fastapi-docker on Google Cloud Run #238
Comments
To decide the number of workers: N = number of threads + 1. So if you are GPU limited, that's your criterion for deciding the number of workers. What I wrote above is based on what I observed in a few tests; it might well be incorrect.
I would also recommend using Gunicorn instead of Uvicorn to run the app.
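Putting the two suggestions above together, here is a minimal sketch of a gunicorn config file. This is my illustration, not a tested production config from this thread: the file name `gunicorn.conf.py`, the request-recycling numbers, and the assumption that the FastAPI app is `main:app` are all mine.

```python
# gunicorn.conf.py -- minimal sketch, assumptions noted below.
# Assumes the FastAPI app lives in main.py as `app` and that the
# uvicorn worker class is available (installed with uvicorn).
import multiprocessing

bind = "0.0.0.0:8080"
worker_class = "uvicorn.workers.UvicornWorker"

# Rule of thumb from the comment above: N = number of threads + 1.
workers = multiprocessing.cpu_count() + 1

# Recycle each worker after a bounded number of requests, to work
# around the memory leak described in the issue; the jitter keeps all
# workers from restarting at the same moment. Both numbers are
# illustrative, not tuned values.
max_requests = 200
max_requests_jitter = 20
```

With this file in place, the Dockerfile command reduces to `CMD gunicorn main:app`, since gunicorn picks up `gunicorn.conf.py` from the working directory by default.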
Hi everybody,
I am pretty new to web app development and have doubts about how to make the best out of this incredible docker image.
In short, I have been trying to deploy a Hugging Face pipeline on Google Cloud Run using the uvicorn-gunicorn-fastapi-docker image. The model takes about 3.5GB, while the base Cloud Run instance can have up to 16 vCPUs and 32GB of RAM. At deployment time, I also need to manually specify the maximum number of concurrent requests before autoscaling happens.
How should I set up the number of workers/threads for gunicorn/uvicorn, and the characteristics of the base cloud run instance? I noticed that, for every additional worker and/or thread, 3.5GB of RAM are needed. Also, during execution, memory leakage occurs, which would require a worker to be restarted every now and then.
My naive guess is that I should have as many workers as the number of vCPUs and at least 3.5GB of RAM times the number of workers. Is that correct? What about the number of concurrent requests?
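For what it's worth, the back-of-the-envelope arithmetic behind that guess looks like this (the 2 GB of headroom is my assumption, not a figure from this thread):

```python
# Rough RAM sizing: each gunicorn/uvicorn worker loads its own copy
# of the model, so memory scales linearly with the worker count.
model_gb = 3.5      # model size stated in the issue
workers = 4         # as in the uvicorn command in the issue
headroom_gb = 2.0   # assumed slack for the OS, server, and request buffers
required_ram_gb = model_gb * workers + headroom_gb
print(required_ram_gb)  # 16.0
```

So with 4 workers, a 16GB instance is already at the floor, which is consistent with the RAM saturation described below.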
Right now, my uvicorn command in the dockerfile looks like this:
`CMD uvicorn main:app --host 0.0.0.0 --port 8080 --workers 4 --access-log --use-colors`
Nonetheless, with this setting, after a while the RAM gets saturated and the service breaks down :(
Any help is more than welcome.
Thank you in advance. Best