This repository contains code for a Flask server that's containerized and deployed to Vertex AI on GCP and to Azure Machine Learning Studio (https://studio.azureml.net/).
The Flask server provides access to the following Sunbird AI models:
- ASR (speech to text) for Luganda.
- Translation (local languages to English and English to local languages).
- TTS (available on Azure only).
The process of deployment is as follows:
- The models are pulled from HuggingFace; see `asr_inference` and `translate_inference` (a loading sketch follows this list).
- The Flask app exposes 2 endpoints, `isalive` and `predict`, as required by Vertex AI. The `predict` endpoint receives a list of inference requests, passes them to the model and returns the results (see the sketch after this list).
- A Docker container is built from this Flask app and pushed to the Google Container Registry (GCR).
- On Vertex AI, a "model" is created from this container and then deployed to a Vertex endpoint (sketched below); the process is similar for Azure.
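
For reference, a minimal sketch of what pulling the models from HuggingFace might look like. The model IDs below are placeholders, not the actual Sunbird checkpoints; see `asr_inference` and `translate_inference` for the real loading code.

```python
# Sketch of pulling models from HuggingFace with the transformers library.
# The model IDs are placeholders, not the actual Sunbird checkpoints.
from transformers import pipeline

# ASR (speech to text) for Luganda
asr = pipeline("automatic-speech-recognition", model="example-org/luganda-asr")

# Translation between local languages and English
translator = pipeline("translation", model="example-org/mul-en-translate")

transcript = asr("sample_audio.wav")["text"]
translation = translator(transcript)[0]["translation_text"]
```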
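A rough sketch of the Flask app itself, assuming Vertex AI's custom-container request/response contract (`{"instances": [...]}` in, `{"predictions": [...]}` out); `run_inference` is a hypothetical stand-in for the actual model calls:

```python
# Illustrative Flask app exposing the two routes Vertex AI requires.
from flask import Flask, jsonify, request

app = Flask(__name__)


def run_inference(instance):
    # Hypothetical stand-in: dispatch the request to the ASR or translation
    # model (see asr_inference / translate_inference for the real logic).
    raise NotImplementedError


@app.route("/isalive")
def isalive():
    # Health check polled by Vertex AI to confirm the container is serving.
    return "", 200


@app.route("/predict", methods=["POST"])
def predict():
    # Receive a list of inference requests, run each through the model,
    # and return the results in the shape Vertex AI expects.
    instances = request.get_json()["instances"]
    predictions = [run_inference(instance) for instance in instances]
    return jsonify({"predictions": predictions})


if __name__ == "__main__":
    app.run(host="0.0.0.0", port=8080)
```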
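The Vertex AI step can be scripted with the `google-cloud-aiplatform` SDK; a sketch with placeholder project, region, image URI and machine type:

```python
# Sketch of creating a Vertex AI "model" from the pushed container and
# deploying it to an endpoint. All resource names here are placeholders.
from google.cloud import aiplatform

aiplatform.init(project="my-gcp-project", location="us-central1")

model = aiplatform.Model.upload(
    display_name="sunbird-inference",
    serving_container_image_uri="gcr.io/my-gcp-project/sunbird-inference:latest",
    serving_container_predict_route="/predict",
    serving_container_health_route="/isalive",
    serving_container_ports=[8080],
)

endpoint = model.deploy(machine_type="n1-standard-8")
```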
NOTE: Check out this article for a detailed tutorial on this process for GCP.
For Azure, check out [this Microsoft guide](https://learn.microsoft.com/en-us/azure/machine-learning/how-to-deploy-online-endpoints?view=azureml-api-2&tabs=azure-cli) on how to deploy models to online endpoints.
The resulting endpoint is then used in the main Sunbird AI API.
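
For illustration, a call to the deployed Vertex endpoint might look like this (the endpoint resource name and payload shape are placeholders):

```python
# Illustrative call to the deployed Vertex endpoint from the API side.
from google.cloud import aiplatform

endpoint = aiplatform.Endpoint(
    "projects/my-gcp-project/locations/us-central1/endpoints/1234567890"
)
response = endpoint.predict(instances=[{"sentence": "Oli otya?"}])
print(response.predictions)
```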
Next steps:
- Add TTS (this is already available on Azure).
- Handle long audio files (this is already available on Azure).
- Use a smaller base container; the current one (`huggingface/transformers-pytorch-gpu`) is pretty heavy and maybe unnecessary. This would leave us with a smaller artifact that takes up less memory.
- Automate the deployment process for both the API and this inference service (using GitHub Actions or Terraform... or both?).
- Come up with an end-to-end workflow from data ingestion to deployment (what tools are required for this?).