
Deploying ML Model with FastAPI and Docker


One can deploy any ML model with FastAPI, and a wiki on this has also been created; refer here for more info.

Setting up an ML Model with FastAPI (single data point)

This is an illustration of deploying a Random Forest classification model for the wine dataset. The code for the FastAPI server is here.
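For context, a classifier like the wine.pkl file loaded below could be produced with a few lines of scikit-learn. This is only a sketch assuming the built-in wine dataset, not necessarily the repository's exact training script:

import pickle

from sklearn.datasets import load_wine
from sklearn.ensemble import RandomForestClassifier

# Train a Random Forest on the 13-feature wine dataset (3 classes)
X, y = load_wine(return_X_y=True)
clf = RandomForestClassifier(random_state=42).fit(X, y)

# Serialise the classifier to the file the FastAPI server loads at startup
with open("wine.pkl", "wb") as file:
    pickle.dump(clf, file)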

Again, much of the code is similar to what has been discussed in another wiki; if needed, please refer here for more info.

The new part of the code (together with the imports it relies on) is:

import os
import pickle

from fastapi import FastAPI
from pydantic import BaseModel

app = FastAPI()


# Represents a particular wine (or datapoint)
class Wine(BaseModel):
    alcohol: float
    malic_acid: float
    ash: float
    alcalinity_of_ash: float
    magnesium: float
    total_phenols: float
    flavanoids: float
    nonflavanoid_phenols: float
    proanthocyanins: float
    color_intensity: float
    hue: float
    od280_od315_of_diluted_wines: float
    proline: float


@app.on_event("startup")
def load_clf():
    # Load classifier from pickle file
    with open(f"{os.getcwd()}/wine.pkl", "rb") as file:
        global clf
        clf = pickle.load(file)

The Wine class represents a data point of the wine dataset, listing each attribute and its respective type. It inherits from pydantic's BaseModel; more documentation [here](https://pydantic-docs.helpmanual.io/usage/models/).

The other new piece of code is @app.on_event("startup"), which ensures that the decorated function runs when the server starts up. Here it loads the pre-trained wine classifier and makes it available as a global variable, so the model is ready before any request is served.
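The remaining piece is the /predict endpoint, which turns an incoming Wine instance into a feature array and calls the loaded classifier. The sketch below shows the general shape; it lives in the same main.py as the snippet above, and the repository's exact implementation (including the response format) may differ slightly:

import numpy as np


@app.post("/predict")
def predict(wine: Wine):
    # Build a 2D array with the features in the order the model was trained on
    data_point = np.array(
        [
            [
                wine.alcohol, wine.malic_acid, wine.ash, wine.alcalinity_of_ash,
                wine.magnesium, wine.total_phenols, wine.flavanoids,
                wine.nonflavanoid_phenols, wine.proanthocyanins,
                wine.color_intensity, wine.hue,
                wine.od280_od315_of_diluted_wines, wine.proline,
            ]
        ]
    )
    # clf is the global classifier loaded in load_clf() at startup
    prediction = clf.predict(data_point).tolist()[0]
    return {"Prediction": prediction}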

As always, one can test the FastAPI deployment by running uvicorn main:app --reload from the same directory as the main.py file.

Creating a Dockerfile to host the FastAPI server

The full Dockerfile is here.

A refresher on Docker commands and how to build a Dockerfile can be found here.

The base image used here is frolvlad/alpine-miniconda3:python3.7; alpine is a very lightweight Linux distribution, and the image already includes miniconda3 with Python 3.7. Instead of installing the dependencies directly, a virtual environment should be created in which all the dependencies are installed.

Another note is to expose the port at which the server can be reached; the command below exposes port 80:

EXPOSE 80

The final command in the Dockerfile runs the server on port 80.

CMD ["uvicorn", "app.main:app", "--host", "0.0.0.0", "--port", "80"]

Building and running the Docker image

Building the image

Either be inside the directory that contains the Dockerfile, or provide the path to the directory where it is stored.

# when inside the directory which contains Dockerfile
docker build -t deploying_ml_model:no_batch .
# providing the path to the Dockerfile
docker build -t deploying_ml_model:no_batch ./deploying_model_w_docker_and_FastAPI/no-batch/

Running the image

The command below runs the container and removes it automatically after it has been stopped (the --rm flag).

docker run --rm -p 80:80 deploying_ml_model:no_batch

Upon successfully running the container, the server can be accessed at localhost:80.

Sending requests to the server

curl -X 'POST' http://localhost/predict \
  -H 'Content-Type: application/json' \
  -d '{
  "alcohol":12.6,
  "malic_acid":1.34,
  "ash":1.9,
  "alcalinity_of_ash":18.5,
  "magnesium":88.0,
  "total_phenols":1.45,
  "flavanoids":1.36,
  "nonflavanoid_phenols":0.29,
  "proanthocyanins":1.35,
  "color_intensity":2.45,
  "hue":1.04,
  "od280_od315_of_diluted_wines":2.77,
  "proline":562.0
}'

There are some example files in this directory in the same format as above; these files can also be sent to the server directly:

curl -X POST http://localhost:80/predict \
    -d @./wine-examples/1.json \
    -H "Content-Type: application/json"

The description of the flags:

  • -X: Allows you to specify the request type. In this case it is a POST request.
  • -d: Stands for data and allows you to attach data to the request.
  • -H: Stands for Headers and allows you to pass additional information through the request. In this case it is used to tell the server that the data is sent in JSON format.
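The same request can also be sent from Python using the requests library. This is a small client-side sketch; requests is not part of the server's dependencies, and the values simply mirror the curl example above:

import requests

wine_sample = {
    "alcohol": 12.6,
    "malic_acid": 1.34,
    "ash": 1.9,
    "alcalinity_of_ash": 18.5,
    "magnesium": 88.0,
    "total_phenols": 1.45,
    "flavanoids": 1.36,
    "nonflavanoid_phenols": 0.29,
    "proanthocyanins": 1.35,
    "color_intensity": 2.45,
    "hue": 1.04,
    "od280_od315_of_diluted_wines": 2.77,
    "proline": 562.0,
}

# POST the JSON body to the running container and print the prediction
response = requests.post("http://localhost:80/predict", json=wine_sample)
print(response.json())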

Setting up an ML Model with FastAPI (batch data points)

The files are all here.

The main differences are in main.py.

from typing import List

from pydantic import BaseModel, conlist


# Represents a batch of wines
class Wine(BaseModel):
    batches: List[conlist(item_type=float, min_items=13, max_items=13)]

By setting the attribute batches and specifying that it will be of type List of conlists, the endpoint can start accepting batches of data. Since FastAPI enforces types, they need to be specified explicitly. In this case the batch will be a list of arbitrary size, but the type of the elements within that list also needs to be specified. A List of Lists of floats would work, but there is a better alternative: pydantic's conlist. The "con" prefix stands for constrained, so this is a constrained list. This type allows you to select the type of the items within the list as well as the minimum and maximum number of items. In this case the model was trained on 13 features, so each data point must be of size 13.
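To see the constraint in action, the model can be instantiated directly. Assuming the Wine class from the snippet above is in scope, a row with the wrong number of values raises a ValidationError:

from pydantic import ValidationError

# A valid batch: the single row contains exactly 13 floats
Wine(batches=[[12.6, 1.34, 1.9, 18.5, 88.0, 1.45, 1.36, 0.29,
               1.35, 2.45, 1.04, 2.77, 562.0]])

# An invalid batch: the row has only 3 values, so validation fails
try:
    Wine(batches=[[12.6, 1.34, 1.9]])
except ValidationError as err:
    print(err)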

The predict method also has to be updated to accept a batch instead of a single data point. The only necessary change is to convert the conlist to an array:

batches = wine.batches
np_batches = np.array(batches)
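Put together, the batched /predict endpoint looks roughly like this. It is a sketch in the same main.py; clf and app come from the startup code discussed earlier, and the exact response format may differ in the repository:

import numpy as np


@app.post("/predict")
def predict(wine: Wine):
    # Each element of wine.batches is a validated list of 13 floats
    batches = wine.batches
    np_batches = np.array(batches)
    # Predict one class label per row in the batch
    prediction = clf.predict(np_batches).tolist()
    return {"Prediction": prediction}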

Sending requests to the server

For a batched prediction request, it's easiest to use the example files provided here and send the data to the server with curl:

curl -X POST http://localhost:80/predict \
    -d @./wine-examples/batch_1.json \
    -H "Content-Type: application/json"

Upon successfully running the Docker image and sending a prediction request, you will get an output like:

{"Prediction":[2,1,1,0,1,0,0,1,0,0,2,1,1,1,0,1,1,1,2,2,0,1,2,2,1,1,0,1,2,2,1,2]}