This demo showcases how to train, deploy, and monitor an LLM using the LLM-as-a-judge approach.
This demo illustrates training an open-source model to answer banking-related questions only. It analyzes the responses generated by live model traffic and retrains the model according to its performance. The performance analysis is done by a separate LLM that judges the results. Once the dataset is large enough, you can retrain the model and measure the performance again.
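The judge-and-retrain loop described above can be sketched in a few lines. This is a toy illustration only: the `judge` function here is a trivial keyword check standing in for the real LLM judge, and `collect_and_retrain` stands in for the dataset-building and fine-tuning steps shown later in the demo.

```python
# Toy sketch of the judge-and-retrain loop. `judge` is a hypothetical
# stand-in for the real LLM judge used later in the demo.

def judge(question: str, answer: str) -> int:
    """Toy judge: score 1 if the answer stays on banking topics, else 0."""
    return int("bank" in answer.lower())

def collect_and_retrain(traffic, min_dataset_size=3):
    """Score model traffic; once enough low-scoring examples accumulate,
    signal that the model should be retrained."""
    retrain_set = [(q, a) for q, a in traffic if judge(q, a) == 0]
    return len(retrain_set) >= min_dataset_size, retrain_set

traffic = [
    ("How do I open an account?", "Visit your bank branch."),
    ("Tell me a joke", "Why did the chicken cross the road?"),
    ("What is APR?", "Annual percentage rate charged by a bank."),
    ("Who won the game?", "The home team won."),
    ("Best pizza?", "Try the margherita."),
]
should_retrain, dataset = collect_and_retrain(traffic)
```

In the actual demo the judging is done by a separate OpenAI model and the retraining by ORPO fine-tuning, but the control flow is the same: score traffic, accumulate failures, retrain when the set is large enough.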
This demo requires an available GPU.
This demo uses the following Python packages:
- mlrun - Iguazio's MLRun to orchestrate the entire demo.
- openai - OpenAI's ChatGPT as the LLM Judge.
- transformers - Hugging Face's Transformers for using Google's gemma-2b LLM.
- datasets - Hugging Face's datasets package for loading the banking dataset used in the demo.
- trl - Hugging Face's TRL for the ORPO fine-tuning.
- peft - Hugging Face's PEFT for the LORA adapter fine-tuning.
- bitsandbytes - Hugging Face's BitsAndBytes for loading the LLM.
- sentencepiece - Google's tokenizer for Gemma-2B.
Note: This demo uses the gemma-2b model by Google. This model is publicly accessible, but to use it you must first read and accept its terms and conditions. Alternatively, choose a different model and change the code of this demo accordingly.
import sys
%pip install -U -r requirements.txt
if sys.version_info.major == 3 and sys.version_info.minor == 9:
    %pip install protobuf==3.20.3
import os
import random
import time
import dotenv
import pandas as pd
from tqdm.notebook import tqdm
from datasets import load_dataset
import sys
import shutil
import mlrun
from mlrun.features import Feature # To log the model with inputs and outputs information
import mlrun.common.schemas.alert as alert_constants # To configure an alert
from mlrun.model_monitoring.helpers import get_result_instance_fqn # To configure an alert
from src.llm_as_a_judge import OpenAIJudge
pd.set_option("display.max_colwidth", None)
- A Hugging Face access token can be created in your account settings, under Access Tokens.
- An OpenAI secret API key can be found on the API keys page.
dotenv.load_dotenv()  # You can create a .env file with the variables HF_TOKEN, OPENAI_API_KEY, and OPENAI_BASE_URL
OPENAI_MODEL = "gpt-4o"
- LLM as a Judge
- Notebook: llm-monitoring-main.ipynb
- Description: Use a dataset and two types of prompt templates (to understand the difference) to get the LLM to function well as a judge.
- Key steps:
- Load the banking dataset
- Create an accuracy metric
- Create the evaluation set
- Prompt engineering the judge
- Key files:
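The key steps above (an accuracy metric and prompt engineering for the judge) might look like the following sketch. The prompt template text and the `accuracy` helper are illustrative assumptions, not the demo's actual implementation.

```python
# Illustrative judge prompt template and accuracy metric; the template
# wording and thresholds are assumptions, not the demo's actual code.

JUDGE_PROMPT = """You are a strict evaluator for a banking assistant.
Question: {question}
Answer: {answer}
Reply with a single integer score from 1 (off-topic) to 5 (helpful banking answer)."""

def build_judge_prompt(question: str, answer: str) -> str:
    """Fill the judge template with a question/answer pair."""
    return JUDGE_PROMPT.format(question=question, answer=answer)

def accuracy(scores, threshold=4):
    """Fraction of judged answers at or above the passing threshold."""
    passing = [s for s in scores if s >= threshold]
    return len(passing) / len(scores)

accuracy([5, 4, 2, 5])  # fraction of answers scoring >= 4
```

A more constrained template (explicit scoring rubric, fixed output format) typically makes the judge's scores easier to parse and more consistent, which is the point of comparing two prompt styles in this step.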
- MLRun's Model Monitoring
- Notebook: llm-monitoring-main.ipynb
- Description: Deploy model monitoring and the gemma-2b LLM.
- Key steps:
- Deploy the model monitoring application
- DeepEval model monitoring function
- Deploy the LLM
- Configure an alert
- Check the performance of the base model
- Evaluate the model using DeepEval on banking and non-banking questions
- Key files:
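The alerting idea behind the monitoring steps above can be sketched as a windowed check on judge scores: track a rolling average and fire once it drops below a threshold. This mimics what the MLRun alert configuration does; it is not the actual MLRun API.

```python
# Hedged sketch of threshold-based alerting over judge scores.
# Not MLRun's API; just the underlying idea of its alert configuration.
from collections import deque

class ScoreMonitor:
    def __init__(self, window=4, threshold=0.5):
        self.scores = deque(maxlen=window)  # rolling window of judge scores
        self.threshold = threshold

    def record(self, score: float) -> bool:
        """Record a judge score in [0, 1]; return True if an alert fires."""
        self.scores.append(score)
        avg = sum(self.scores) / len(self.scores)
        # Fire only once the window is full and the average is too low.
        return len(self.scores) == self.scores.maxlen and avg < self.threshold

monitor = ScoreMonitor()
fired = [monitor.record(s) for s in [1.0, 0.0, 0.0, 0.0, 0.0]]
```

Waiting for a full window before alerting avoids firing on a single bad answer, which is the same reason monitoring applications typically aggregate results over a time window before triggering an alert.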
- ORPO Fine-tuning
- Notebook: llm-monitoring-main.ipynb
- Description: Create a fine-tuned model that only answers banking questions.
- Key steps:
- Build the training set
- Fine-tune the model
- Check the performance
- Evaluate the model using DeepEval
