Collection of generative AI prototypes, mainly using LLMs.
- Generative AI prototypes
- Prototypes
- Templating messages and functions
- Message history
- Setup
- Launch the prototypes
- TODO
This prototype uses the OpenAI API to generate an explanation of a concept. The user can enter a concept and the model will explain it in very simple and cheerful terms.
ELI3 has a chat interface, meaning you can ask follow-up questions or another ELI3 question.
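The idea in a minimal sketch, assuming the pre-1.0 openai Python client; the system prompt and model name below are illustrative, not the exact ones used in the repo:
import openai  # assumes the pre-1.0 openai client and OPENAI_API_KEY in the environment
# Illustrative ELI3-style system prompt; the real prompt lives in the repo's templates
messages = [
    {"role": "system", "content": "Explain concepts to a three-year-old in simple, cheerful language."},
    {"role": "user", "content": "What is a rainbow?"},
]
response = openai.ChatCompletion.create(model="gpt-3.5-turbo", messages=messages)
print(response["choices"][0]["message"]["content"])
# Follow-up questions simply extend the same message list
messages.append(response["choices"][0]["message"])
messages.append({"role": "user", "content": "Why does it have so many colours?"})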
This prototype uses the OpenAI API to generate EYFS-related activities. The user queries the model with a topic and the model will generate a list of conversations and activities.
This prototype supports a chat interface similar to the ELI3 prototype.
This prototype uses the OpenAI API to generate EYFS-related activities. It leverages external knowledge bases like BBC's "Tiny Happy People" to append example activities to the prompt based on the user's query. The user queries the model with a topic and the model will generate a list of conversations and activities.
Note that to run this prototype, you need to:
- Get in touch for the BBC Tiny Happy People dataset
- Run
python src/genai/eyfs/run_classifier.py
- Run
python src/genai/eyfs/run_pinecone_index.py
This prototype uses a text messaging app (WhatsApp) as an accessible front end to a large language model (LLM), which can explain simple concepts or generate personalised activity ideas. More information can be found here.
This prototype generates early-years activities that are anchored to the Development Matters guidance.
Generate activities based on the user's query and the Development Matters learning goals and examples
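A rough sketch of how such a prompt could be assembled with this repo's MessageTemplate; the learning goals and examples below are placeholders, not the real Development Matters data:
from genai import MessageTemplate
# Placeholder learning goals and examples; the real ones come from the Development Matters guidance
learning_goals = "- Communication and Language\n- Physical Development"
examples = "- Sing counting songs together\n- Build a tower and count the blocks"
prompt = MessageTemplate(
    role="user",
    content="Generate activities about {topic}.\nLearning goals:\n{goals}\nExamples:\n{examples}",
)
prompt.format_message(topic="autumn leaves", goals=learning_goals, examples=examples)
print(prompt.to_prompt())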
We built a parenting chatbot that can answer questions related to pregnancy, babies and toddlers. We used RAG to contextualise our responses based on the NHS Start for Life website.
Firstly, we indexed the NHS Start for Life website using Pinecone.
Then, we built a chatbot in Streamlit. The chatbot queries the Pinecone index with the user's input and fetches the top N most similar documents. We then pass those documents through an LLM that classifies them as "relevant" or "not relevant" to the user query; we do this because the index always returns documents, even when none of them are relevant.
Finally, we add the relevant documents to a prompt along with the user question and call an LLM to generate a response.
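The flow looks roughly like the sketch below. It assumes the pre-1.0 openai client and the v2 pinecone-client; the index name, metadata fields, prompts and model names are illustrative:
import openai
import pinecone
pinecone.init(api_key="<PINECONE_API_KEY>", environment="<PINECONE_ENV>")
index = pinecone.Index("<INDEX_NAME>")  # illustrative index name
question = "How often should I feed a newborn?"
# 1. Embed the question and fetch the top N most similar NHS Start for Life chunks
embedding = openai.Embedding.create(model="text-embedding-ada-002", input=question)["data"][0]["embedding"]
matches = index.query(vector=embedding, top_k=5, include_metadata=True)["matches"]
# 2. Ask an LLM to keep only the chunks that are actually relevant to the question
relevant = []
for match in matches:
    text = match["metadata"]["text"]  # assumes the chunk text is stored in metadata
    verdict = openai.ChatCompletion.create(
        model="gpt-3.5-turbo",
        messages=[{"role": "user", "content": f"Is this text relevant to '{question}'? Answer yes or no.\n\n{text}"}],
    )["choices"][0]["message"]["content"]
    if "yes" in verdict.lower():
        relevant.append(text)
# 3. Answer the question using only the relevant chunks as context
answer = openai.ChatCompletion.create(
    model="gpt-3.5-turbo",
    messages=[{"role": "user", "content": "Context:\n" + "\n".join(relevant) + f"\n\nQuestion: {question}"}],
)["choices"][0]["message"]["content"]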
`MessageTemplate` and `FunctionTemplate` enable you to define a template for a prompt and work with existing templates. Both classes inherit methods from `BasePromptTemplate`.
Let's create an OpenAI message.
from genai import MessageTemplate
# Set a dummy role and content
my_role = "dummy_role"
my_content = "Hello {text}! This is a json template {{'a': '{value}'}}."
# Use the MessageTemplate as a dataclass, great for experimentation
prompt = MessageTemplate(role=my_role, content=my_content)
prompt
# MessageTemplate(initial_template={'role': 'dummy_role', 'content': "Hello {text}! This is a json template {{'a': '{value}'}}."}, role='dummy_role', content="Hello {text}! This is a json template {{'a': '{value}'}}.")
# Fill in your placeholders, `text` and `value`
prompt.format_message(text="world", value=42)
prompt.content
# "Hello world! This is a json template {'a': '42'}."
prompt.role
# 'dummy_role'
prompt.to_prompt()
# {'role': 'dummy_role', 'content': "Hello world! This is a json template {'a': '42'}."}
# Store the template in a JSON file, good for versioning
prompt.to_json(path="prompt.json")
# Load it back
prompt = MessageTemplate.load("prompt.json")
# Or read a dictionary instead
prompt = MessageTemplate.load({"role": my_role, "content": my_content})
Let's create an OpenAI function. The methods are exactly the same as above, only the attributes are different.
from genai import FunctionTemplate
# Set a dummy name, description and parameters
my_name = "dummy_name"
my_description = "This is a dummy description."
my_parameters = {
    'type': 'object',
    'properties': {
        'prediction': {
            'type': 'string',
            'description': 'The predicted area of learning',
            'enum': "{list_of_dummy_categories}",
        }
    },
    'required': ['prediction'],
}
# Use the FunctionTemplate as a dataclass, great for experimentation
prompt = FunctionTemplate(name=my_name, description=my_description, parameters=my_parameters)
prompt
# FunctionTemplate(initial_template={'name': 'dummy_name', 'description': 'This is a dummy description.', 'parameters': {'type': 'object', 'properties': {'prediction': {'type': 'string', 'description': 'The predicted area of learning', 'enum': '{list_of_dummy_categories}'}}, 'required': ['prediction']}}, name='dummy_name', description='This is a dummy description.', parameters={'type': 'object', 'properties': {'prediction': {'type': 'string', 'description': 'The predicted area of learning', 'enum': '{list_of_dummy_categories}'}}, 'required': ['prediction']})
# Fill in your placeholder, `list_of_dummy_categories`
prompt.format_message(list_of_dummy_categories=["A", "B", "C"])
prompt.parameters
# {'type': 'object', 'properties': {'prediction': {'type': 'string', 'description': 'The predicted area of learning', 'enum': "['A', 'B', 'C']"}}, 'required': ['prediction']}
prompt.name
# 'dummy_name'
prompt.to_prompt()
# {'name': 'dummy_name', 'description': 'This is a dummy description.', 'parameters': {'type': 'object', 'properties': {'prediction': {'type': 'string', 'description': 'The predicted area of learning', 'enum': "['A', 'B', 'C']"}}, 'required': ['prediction']}}
# Store the template in a JSON file, good for versioning
prompt.to_json(path="prompt.json")
# Load it back
prompt = FunctionTemplate.load("prompt.json")
# Or read a dictionary instead
prompt = FunctionTemplate.load({"name": my_name, "description": my_description, "parameters": my_parameters})
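To tie the two together, a formatted message and function can be passed straight to the chat completions endpoint. A minimal sketch, assuming the pre-1.0 openai client and an illustrative model name, and reusing the dummy variables defined above:
import openai
from genai import FunctionTemplate, MessageTemplate
message = MessageTemplate(role="user", content="Classify this activity: {activity}")
message.format_message(activity="Singing counting songs")
function = FunctionTemplate(name=my_name, description=my_description, parameters=my_parameters)
function.format_message(list_of_dummy_categories=["A", "B", "C"])
response = openai.ChatCompletion.create(
    model="gpt-3.5-turbo",
    messages=[message.to_prompt()],
    functions=[function.to_prompt()],
    function_call={"name": function.name},  # force the model to call the function
)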
from genai.message_history import InMemoryMessageHistory
# Instantiate empty history
history = InMemoryMessageHistory()
# Create a bunch of messages
msg1 = {"role": "system", "content": "You are a helpful assistant."}
msg2 = {"role": "user", "content": "Hi bot, I need help."}
msg3 = {"role": "assistant", "content": "Hi human, what do you need?"}
msg4 = {"role": "user", "content": "I need you to tell me a joke."}
msg5 = {"role": "assistant", "content": "What do you call a fish without eyes?"}
msg6 = {"role": "user", "content": "I don't know."}
msg7 = {"role": "assistant", "content": "A fsh."}
messages = [msg1, msg2, msg3, msg4, msg5, msg6, msg7]
# Add them to history
for message in messages:
    history.add_message(message)
history.messages
# [{'role': 'system', 'content': 'You are a helpful assistant.'},
# {'role': 'user', 'content': 'Hi bot, I need help.'},
# {'role': 'assistant', 'content': 'Hi human, what do you need?'},
# {'role': 'user', 'content': 'I need you to tell me a joke.'},
# {'role': 'assistant', 'content': 'What do you call a fish without eyes?'},
# {'role': 'user', 'content': "I don't know."},
# {'role': 'assistant', 'content': 'A fsh.'}]
# Buffer history by max_tokens. Keep the system message.
history.get_messages(model_name="gpt-3.5-turbo", max_tokens=60, keep_system_message=True)
# [{'role': 'system', 'content': 'You are a helpful assistant.'},
# {'role': 'user', 'content': 'I need you to tell me a joke.'},
# {'role': 'assistant', 'content': 'What do you call a fish without eyes?'},
# {'role': 'user', 'content': "I don't know."},
# {'role': 'assistant', 'content': 'A fsh.'}]
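In a chat application, the buffered messages are what you would send to the model on each turn. A minimal sketch, assuming the pre-1.0 openai client and an illustrative model name and token budget:
import openai
from genai.message_history import InMemoryMessageHistory
history = InMemoryMessageHistory()
history.add_message({"role": "system", "content": "You are a helpful assistant."})
user_input = "Tell me a joke."
history.add_message({"role": "user", "content": user_input})
# Send only as much history as fits the token budget, keeping the system message
response = openai.ChatCompletion.create(
    model="gpt-3.5-turbo",
    messages=history.get_messages(model_name="gpt-3.5-turbo", max_tokens=1000, keep_system_message=True),
)
reply = response["choices"][0]["message"]["content"]
history.add_message({"role": "assistant", "content": reply})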
Assuming you work on a Mac, you can use the following commands to set up your environment from scratch with `pyenv` and `poetry`. Please deactivate any anaconda environments you might have activated before the setup.
- Install brew. Confirm you've installed it correctly by running:
brew --version
- Install `pyenv`
brew install pyenv
At the end of the installation, `pyenv` will advise you to add the following lines to your `.bash_profile` (or `.zshrc` if you use `zsh`). Do that, save the file and restart your terminal.
export PYENV_ROOT="$HOME/.pyenv"
command -v pyenv >/dev/null || export PATH="$PYENV_ROOT/bin:$PATH"
eval "$(pyenv init -)"
- Install Python
Install/update a few dependencies
brew install openssl readline sqlite3 xz zlib
Install Python 3.9.17
pyenv install 3.9.17
Confirm you've installed it correctly by running:
pyenv versions
Run the following commands to set the global Python version to 3.9.17.
pyenv global 3.9.17
Close and reopen your terminal so that the changes take effect.
- Install `poetry`. You can use the official installer:
curl -sSL https://install.python-poetry.org | python3 -
Add `poetry` to your PATH. Recommended: add the following line to your `.bash_profile` (or `.zshrc` if you use `zsh`). Save the file and restart your terminal.
export PATH="/Users/<MYUSERNAME>/.local/bin:$PATH"
Confirm your poetry installation:
poetry --version
- Assuming you have installed `pyenv` and `poetry` as described above, you can now install this project:
make init
- Activate the virtual environment:
source .venv/bin/activate
- Add the secrets to the environment.
  - Add your OpenAI API key to the `.env` file. See `.env.example` for an example.
  - The Streamlit app is password-protected. You can either remove the password requirement from `app.py` or create a `.streamlit/secrets.toml` file and add `password='<MYPASSWORD>'`.
Three of the prototypes use the Pinecone database to store and retrieve data. To rebuild the database:
- Create a `.env` file in the root of the project and add all keys listed in `.env.example`.
- Run `make build-pinecone`. This will delete the index if it exists (this happens as part of the first Python script it calls) and rebuild it from scratch.
Notes
- We are using a free Pinecone database which is deleted after seven days of inactivity. If you get any errors like "this index does not exist", you might need to rebuild it.
- The Pinecone database indexes the docs for all prototypes and distinguishes them using the `source` metadata field.
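Because all prototypes share one index, queries are typically restricted to one prototype's documents with a metadata filter. An illustrative sketch with the v2 pinecone-client; the index name, embedding dimension and `source` value are placeholders:
import pinecone
pinecone.init(api_key="<PINECONE_API_KEY>", environment="<PINECONE_ENV>")
index = pinecone.Index("<INDEX_NAME>")  # placeholder index name
# Restrict the search to one prototype's documents via the `source` metadata field
results = index.query(
    vector=[0.0] * 1536,  # replace with a real query embedding (1536 dims for text-embedding-ada-002)
    top_k=5,
    include_metadata=True,
    filter={"source": {"$eq": "<SOURCE_NAME>"}},  # placeholder source value
)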
You can use the Dockerfile to launch the streamlit app without installing the repo and its dependencies.
- Add the required secrets.
  - Add your OpenAI API key to the `.env` file. See `.env.example` for an example.
  - The Streamlit app is password-protected. You can either remove the password requirement from `app.py` or create a `.streamlit/secrets.toml` file and add `password='<MYPASSWORD>'`.
- Assuming Docker is installed on your local machine, you can build the image with:
docker build -t <USERNAME>/<YOUR_IMAGE_NAME> -f Dockerfile .
- Then run the image with:
docker run -p 8501:8501 <USERNAME>/<YOUR_IMAGE_NAME>
- You can now access the app at `http://localhost:8501`.
Assuming you are not an admin of this repo, you would need to fork it and deploy the app on Streamlit Cloud using your Streamlit account. Let's see how you can do that.
- Fork this repo.
- Create a Streamlit Cloud account and connect it to your GitHub account.
- Click on the New app button on Streamlit Cloud to create a new app and set the following fields:
  - Repository: `<your-githubuser-name>/discovery_generative_ai`
  - Branch: `dev`
  - Main file path: `app.py`
- Click on Advanced settings and:
  - Set Python version to 3.9.
  - Add your Secrets using TOML format. You can find the required secrets in the section above, basically all the variables in `.env.example` as well as the password in `.streamlit/secrets.toml`.
- Click on Deploy!
Note: Streamlit Cloud has a pretty obnoxious requirement: it only looks for the latest patch release of a Python version. This might lead to errors, as the project works with `python==3.9.17` and Streamlit Cloud might try to install `python==3.9.18` or `python==3.9.19` once that's available. To fix that, you would need to update the project's Python version; there's no way around it.
Alternatively, if you would like to deploy the app on a public server, you can use the `Dockerfile.heroku` file, which has a few modifications to make it work with Heroku.
First create the app and make sure to add the environment variables to your Heroku app:
heroku create
heroku config:set OPENAI_API_KEY=<your_api_key>
Then build and push the image to Heroku:
heroku container:push heroku --recursive --app <your_app_name>
Finally, release the image and start the app:
heroku container:release heroku
heroku ps:scale web=1