KristofarStavrev/huggingface_model_deployment

huggingface_model_deployment

Current infrastructure

[Image: Infrastructure]

Tests coverage report

[Image: Tests coverage report]

Grafana Dashboard with Prometheus metrics and Loki logs

[Images: Dashboard image 1, Dashboard image 2, Dashboard image 3]

TODO:

  • Upload the model to the Hugging Face Hub
  • Productionize and modularize notebook code
  • Makefile
  • Dockerize
    • Integrate Poetry
    • Separate the FastAPI backend and Gradio frontend services into different containers
    • Handle them with Docker Compose
  • GitHub CI/CD
    • Use a self-hosted CI/CD runner
    • Use a self-hosted Docker image registry
    • Self-host the deployment environment
    • Clean up old images/containers on the prod server
    • Enable image caching in the self-hosted runner
    • Create an automated release when a new tag is created
    • FOR FUTURE IMPROVEMENTS: Tag Docker images with the release/commit tag
    • FOR FUTURE IMPROVEMENTS: Alternative tools would be Jenkins, Argo, CircleCI
  • Code styling, linting, security and tests
    • Pytest for unit/mock/integration tests
    • Code coverage
    • Bandit and pip-audit for catching security flaws
    • Mypy for typechecking
    • Ruff for linting
    • Integrate tests in the CI/CD (Optionally use Nox to orchestrate)
    • FOR FUTURE IMPROVEMENTS: Stricter code styling and pre-commit hooks
    • FOR FUTURE IMPROVEMENTS: Stub files that mypy can use for the custom modules (model_utils.py)
  • Logs and messages
    • Logging in all src code
    • Loki + Promtail
    • Prometheus
    • Grafana
    • Create metrics for Prometheus
    • Create a few Grafana dashboards
    • FOR FUTURE IMPROVEMENTS: Kafka
    • FOR FUTURE IMPROVEMENTS: ELK (Elasticsearch, Logstash, Kibana)
  • Deployment and orchestration
    • Kubernetes theory and terminology: cluster, master, workers, pods, deployment, ConfigMaps, secrets, services, ingress, HorizontalPodAutoscaler, rolling updates, probes (liveness/readiness)
    • Kubernetes practical
      • Set up a production-like cluster with multiple machines
      • Move the self-hosted Docker image registry onto the cluster
      • Set up the GPUs on the K8s cluster
      • Deploy the app using a raw K8s manifest
      • Return a unique ID with each response to show which container served it
      • Helmify the Kubernetes YAML files (create Helm charts)
    • K9s workflow
    • KServe
    • Kubeflow
  • Code refactoring and finishing touches
    • Training/Validation/Testing scripts and modularity
    • Documentation (MkDocs / Sphinx) + Docstrings
    • Documentation for the GitHub README: how to start the app, etc.
    • Use the GPU instead of the CPU in model_utils
  • [IN PROGRESS] Create a diagram for the entire architecture (CI/CD, model retraining, etc.)
  • FOR FUTURE IMPROVEMENTS: Deploy in AWS - EC2 or ECS
  • FOR FUTURE IMPROVEMENTS: Terraform/Ansible for infrastructure
  • FOR FUTURE IMPROVEMENTS: User feedback system for model accuracy, Model tracking, data drift tracking, automated retraining (Evidently)
  • FOR FUTURE IMPROVEMENTS: Airflow/Dagster/Prefect/Argo
  • FOR FUTURE IMPROVEMENTS: Weights & Biases (W&B) - paid alternative to MLflow
  • FOR FUTURE IMPROVEMENTS: ONNX/TorchScript (for the PyTorch environment) - used to serialize and export models for cross-framework compatibility and efficient inference
  • FOR FUTURE IMPROVEMENTS: TorchServe / TensorFlow Serving / Triton Inference Server - for serving models at scale
  • FOR FUTURE IMPROVEMENTS: DVC for storing and versioning data and model weights
  • FOR FUTURE IMPROVEMENTS: Distributed Computing - Dask, Spark
  • FOR FUTURE IMPROVEMENTS: GPU-accelerated data science: cuDF, cuML
  • FOR FUTURE IMPROVEMENTS: Feature stores
  • PyTorch Lightning - streamlines developing, training, and scaling PyTorch deep learning models
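
For the Kubernetes item about knowing which container a response comes from, one possible approach is to attach the container's hostname to every response. This is only a minimal sketch, not the repository's actual implementation: the function names and the `served_by` field are hypothetical, and it assumes the hostname identifies the container (by default Docker sets it to the short container ID and Kubernetes sets it to the pod name).

```python
import os
import socket


def container_id() -> str:
    """Return an identifier for the container handling the request.

    Assumption: the HOSTNAME environment variable (or the OS hostname as a
    fallback) is unique per container, which is the default behaviour in
    Docker and Kubernetes.
    """
    return os.environ.get("HOSTNAME") or socket.gethostname()


def tag_response(payload: dict) -> dict:
    """Return a copy of the payload with a hypothetical 'served_by' field."""
    return {**payload, "served_by": container_id()}


if __name__ == "__main__":
    # Example payload; the keys are illustrative, not the app's real schema.
    print(tag_response({"label": "positive", "score": 0.98}))
```

With several replicas behind a Kubernetes service, repeated requests should show different `served_by` values, which makes load balancing visible from the client side.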

Useful MLOps learning path: https://github.com/graviraja/MLOps-Basics
