This repository marks the official start of my MLOps journey. Instead of focusing on a single piece of production-grade machine learning, we will build a full end-to-end pipeline.
We will train simple regression models on the NYC taxi ride dataset and build an MLOps pipeline that covers model training, hyperparameter optimization, experiment tracking, orchestration, deployment, and monitoring. This repository is inspired by the mlops-zoomcamp course by DataTalks.Club.
Since the MLOps tool landscape is very wide, there will be follow-up work on this with various tech stacks.
Setting up a VM on GCP
Dataset
MLflow Experiment Tracking
MLflow Experiment Tracking on GCP
Workflow Orchestration with Prefect
Model Deployment as a web service with Docker, Kubernetes, and GKE
Model Deployment with model from model registry
Streaming Model Deployment (Online)
Batch Model Deployment (Offline)
Scheduling batch scoring jobs with Prefect
Monitoring and debugging with Evidently
conda create -n mlops-orbit python=3.9
conda activate mlops-orbit
pip install -r requirements.txt
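Once the environment is created, a quick sanity check can confirm that the interpreter and the installed packages match what was requested. The following is a minimal sketch, not part of the repository: the helper names (`parse_requirement_names`, `find_missing`, `python_matches`) are hypothetical, and the parser assumes plain `name==version` style lines in requirements.txt (it does not handle URLs or VCS requirements).

```python
import re
import sys
from importlib import metadata

# Matches the distribution name at the start of a requirement line,
# e.g. "some-package" in "some-package==1.0".
_NAME_RE = re.compile(r"^[A-Za-z0-9][A-Za-z0-9._-]*")


def parse_requirement_names(text):
    """Return package names from requirements-style text (simplified parser)."""
    names = []
    for raw in text.splitlines():
        # Drop trailing comments and surrounding whitespace.
        line = raw.split("#", 1)[0].strip()
        # Skip blank lines and pip options such as "-r other.txt".
        if not line or line.startswith("-"):
            continue
        match = _NAME_RE.match(line)
        if match:
            names.append(match.group(0))
    return names


def find_missing(names):
    """Return the names that are not installed in the current environment."""
    missing = []
    for name in names:
        try:
            metadata.version(name)
        except metadata.PackageNotFoundError:
            missing.append(name)
    return missing


def python_matches(expected=(3, 9)):
    """Check that the running interpreter has the expected major.minor version."""
    return sys.version_info[:2] == tuple(expected)
```

Inside the activated mlops-orbit environment, `python_matches()` should return True, and passing the contents of requirements.txt through `parse_requirement_names` and then `find_missing` should return an empty list if the install succeeded.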
Forward the MLflow port, which is 0.0.0.0:5000.
Forward the port for Jupyter if you are using it (127.0.0.1:8888).
Forward the port for the Prefect server (127.0.0.1:4200).
You can also configure these forwards in ~/.ssh/config.
Host gcp-mlflow-tracking-server
    # VM public IP
    HostName xx.xx.xx.xxx
    # VM user
    User pytholic
    # Private SSH key file
    IdentityFile ~/.ssh/mlops-zoomcamp
    StrictHostKeyChecking no
    LocalForward 5001 0.0.0.0:5000
    LocalForward 4200 127.0.0.1:4200
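With the SSH session open, it can help to confirm that the forwarded ports are actually reachable from the local machine. The sketch below is an illustration only: the helper names are hypothetical, and the port numbers are taken from the LocalForward lines above (MLflow on local port 5001, Prefect on local port 4200). It only checks that something accepts TCP connections, not that the service behind it is healthy.

```python
import socket

# Local ends of the tunnels, assumed from the LocalForward entries above.
FORWARDED_PORTS = {"MLflow": 5001, "Prefect": 4200}


def is_port_open(host, port, timeout=1.0):
    """Return True if a TCP connection to host:port succeeds within the timeout."""
    try:
        with socket.create_connection((host, port), timeout=timeout):
            return True
    except OSError:
        # Covers refused connections, timeouts, and unreachable hosts.
        return False


def check_forwards(ports, host="127.0.0.1"):
    """Map each service name to whether its forwarded local port is reachable."""
    return {name: is_port_open(host, port) for name, port in ports.items()}
```

Calling `check_forwards(FORWARDED_PORTS)` while connected via `ssh gcp-mlflow-tracking-server` should report True for each service that is running on the VM; a False value usually means the tunnel is not open or the remote service has not been started.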