Deploy ML Model with FastAPI

Deploying Models with FastAPI

This page shows the basics of deploying a model with FastAPI. The model is a common object detection model.

Installation through requirements.txt

Put the following in a requirements.txt file and install it in a venv with Python 3.8.

cvlib==0.2.6
opencv-python-headless==4.5.3.56
Pillow==8.4.0
tensorflow==2.7.0
uvicorn==0.16.0
python-multipart==0.0.5
fastapi==0.70.1
nest-asyncio==1.5.4 # optional; only needed if you want to run the code in a Jupyter notebook
jupyterlab==3.2.5 # optional; only needed if you want to run the code in a Jupyter notebook
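
For example, assuming a Unix-like shell with Python 3.8 available as python3.8, the environment can be created and the requirements installed like this:

python3.8 -m venv venv
source venv/bin/activate
pip install -r requirements.txt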

FastAPI

We will be using a client-server model: the server hosts the models, and the client makes prediction requests. Configuring a server with FastAPI is as easy as:

app = FastAPI()

And to get the server running, it's as easy as:

uvicorn.run(app)

Your API is coded using FastAPI, but the serving is done using uvicorn, a really fast Asynchronous Server Gateway Interface (ASGI) implementation. Furthermore, FastAPI provides a built-in client to interact with the server, accessible at the /docs endpoint. For instance, if you are running your FastAPI server on localhost, you can access the client at http://localhost:8000/docs. The port might be different from 8000; it's configurable.
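
Putting the two lines together, a minimal sketch of a complete, runnable server looks like this (the title and the endpoint are illustrative, not part of the object detection example):

import uvicorn
from fastapi import FastAPI

app = FastAPI(title='My first FastAPI server')  # illustrative title

@app.get("/")
def root():
    # A trivial endpoint to verify that the server is up
    return {"message": "Hello from FastAPI"}

# Spin up the server on the default port 8000
uvicorn.run(app, host="127.0.0.1", port=8000)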

The client and the server communicate with each other through a protocol called HTTP. The key concept here is that this communication uses verbs to denote common actions. Two very common verbs are:

  • GET -> Retrieves information from the server.
  • POST -> Provides information to the server, which it uses to respond.

If your client does a GET request to an endpoint of a server, you will get some information from this endpoint without the need to provide additional information. In the case of a POST request, you are explicitly telling the server that you will provide some information that it must process in some way.
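
As an illustration, here is what the two verbs look like from the client side, using the requests library against a hypothetical server; the endpoint names and the image file are placeholders:

import requests

# GET: retrieve information from the server; no body needs to be sent
info = requests.get("http://localhost:8000/server-information")
print(info.text)

# POST: provide data (here, an image file) that the server must process
with open("car.jpg", "rb") as f:  # 'car.jpg' is a placeholder image
    result = requests.post("http://localhost:8000/classify-f1-cars", files={"image": f})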

You can host multiple Machine Learning models on the same server. For this to work, you can assign a different endpoint to each model so you always know what model is being used. For example, if you have the website mymlmodels.com, and you would like to host multiple models, you could have the following endpoints:

  • mymlmodels.com/classify-f1-cars/
  • mymlmodels.com/predict-the-next-financial-crisis/
  • mymlmodels.com/how-to-become-a-billionaire/

To formulate the same in FastAPI, it's as easy as:
@app.post("/classify-f1-cars")
def classify_f1_cars(model, image, confidence_level):
    ...

However, if you just want to provide some information to the client, then:

@app.get("/server-information")
def provide_info():
    ...
    return "Some information..."

Notice that the GET request above doesn't require any parameters in the function, whereas the POST request does: interactions with ML models living on endpoints are usually done via POST requests, since you need to provide the information required to compute a prediction. This code will spin up a server for you on localhost, and you will be able to access it via http://localhost:8000/docs. The most essential bits of the code are below:

import uvicorn
from enum import Enum
from fastapi import FastAPI, UploadFile, File

app = FastAPI(title='Deploying a ML Model with FastAPI')

# The models supported by cvlib's detect_common_objects, declared as a str Enum
# so FastAPI can validate the query parameter
class Model(str, Enum):
    yolov3tiny = "yolov3-tiny"
    yolov3 = "yolov3"

@app.get("/info")
def server_information():
    ...

@app.post("/predict")
def prediction(model: Model, file: UploadFile = File(...)):
    ...

host = "127.0.0.1"

# Spin up the server!
uvicorn.run("deploying_object_detection_model_w_FastAPI:app", host=host, port=8000, reload=True)
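
To make the skeleton concrete, below is one possible implementation of the /predict handler. This is a sketch rather than the exact code from the script: it assumes the Model enum above, cvlib's detect_common_objects and draw_bbox helpers, and an images_with_boxes/ output directory:

import io
import os
import cv2
import numpy as np
import cvlib as cv
from cvlib.object_detection import draw_bbox
from fastapi.responses import FileResponse

@app.post("/predict")
def prediction(model: Model, file: UploadFile = File(...)):
    # Read the uploaded bytes and decode them into an OpenCV image
    image_stream = io.BytesIO(file.file.read())
    file_bytes = np.asarray(bytearray(image_stream.read()), dtype=np.uint8)
    image = cv2.imdecode(file_bytes, cv2.IMREAD_COLOR)

    # Run object detection with the requested model
    bbox, label, conf = cv.detect_common_objects(image, model=model)

    # Draw the detected bounding boxes and save the annotated image
    output_image = draw_bbox(image, bbox, label, conf)
    os.makedirs("images_with_boxes", exist_ok=True)
    out_path = f"images_with_boxes/{file.filename}"
    cv2.imwrite(out_path, output_image)

    # Send the annotated image back to the client
    return FileResponse(out_path, media_type="image/jpeg")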

You can try it yourself by running either of the following commands; make sure you have activated your venv:

python deploying_object_detection_model_w_FastAPI.py 

or

uvicorn deploying_object_detection_model_w_FastAPI:app --reload

You will also find the sample images here, and upon running the code you will be able to see the predictions under the ml_productionisation/images_with_boxes/ directory, which is created automatically once you run the script.

Interacting with the server

The obvious way is to access it via http://localhost:8000/docs; however, one can also interact with it through code, using the requests library. To send a POST request to the /predict endpoint, one must supply the following:

base_url = 'http://localhost:8000'
endpoint = '/predict'
model = 'yolov3-tiny'
url_with_endpoint_no_params = base_url + endpoint
full_url = url_with_endpoint_no_params + "?model=" + model

Notice that to send a POST request to the /predict endpoint, we need to add a "?" character followed by the name of the parameter and its value, so the entire URL looks like this: 'http://localhost:8000/predict?model=yolov3-tiny'. The /predict endpoint also requires a second parameter, i.e. the file (image); we won't pass it within the URL but rather through the requests library. The entire code is here, but some of the important snippets are below:

import io
import cv2
import requests
import numpy as np

def response_from_server(url, image_file, verbose=True):
    # Send the image as multipart/form-data under the 'file' field
    files = {'file': image_file}
    response = requests.post(url, files=files)
    if verbose:
        print(f"Status code: {response.status_code}")
    return response

def display_image_from_response(response, filename):
    # Decode the image bytes returned by the server
    image_stream = io.BytesIO(response.content)
    image_stream.seek(0)
    file_bytes = np.asarray(bytearray(image_stream.read()), dtype=np.uint8)
    image = cv2.imdecode(file_bytes, cv2.IMREAD_COLOR)
    # Save the annotated image locally
    cv2.imwrite(f'images_predicted/{filename}', image)

Note: The client code should be run only after starting the server (deploying_object_detection_model_w_FastAPI.py).
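
For example, a full client round trip might look like this; the image path is a placeholder, and the images_predicted/ directory must already exist for display_image_from_response to write into it:

full_url = 'http://localhost:8000/predict?model=yolov3-tiny'

# 'images/car.jpg' is a placeholder path for illustration
with open('images/car.jpg', 'rb') as image_file:
    response = response_from_server(full_url, image_file)

display_image_from_response(response, 'car.jpg')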

Including confidence level to /predict endpoint

One can also include the confidence level as a separate parameter to the /predict endpoint, since the cv.detect_common_objects method has a confidence argument. To do so, change the /predict endpoint as follows (confidence is declared as a float and given a default value so the signature remains valid Python):

@app.post("/predict")
def prediction(model: Model, confidence: float = 0.5, file: UploadFile = File(...)):
    ...
    # Run object detection with the user-supplied confidence threshold
    bbox, label, conf = cv.detect_common_objects(image, model=model, confidence=confidence)
    ...

The URL must also be changed, to 'http://localhost:8000/predict?model=yolov3-tiny&confidence=0.3'.
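
For completeness, a client-side call that supplies both parameters might look like this (again, the image path is a placeholder):

import requests

full_url = 'http://localhost:8000/predict?model=yolov3-tiny&confidence=0.3'

with open('images/car.jpg', 'rb') as image_file:  # placeholder path
    response = requests.post(full_url, files={'file': image_file})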