Frequently Asked Questions

What is Retrieval-Augmented Generation?

Retrieval-Augmented Generation (RAG) is a technique in natural language processing that grounds an AI model's generated responses in information retrieved from an external knowledge base, so that answers are both contextually relevant and backed by specific content.

At its core, RAG involves two main components:

  • Retriever: Think of it as a search engine that finds relevant information in a knowledge base, usually a vector database. In this sample, we're using Azure AI Search as our vector database.

  • Generator: Acts like a writer, taking the prompt and the retrieved information to create a response. Here we're using a Large Language Model (LLM) for this task.
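To make the two roles concrete, here is a minimal sketch with toy in-memory stand-ins. It assumes a keyword-overlap retriever in place of vector search, and it stops at building the prompt that a generator would receive. The names `Doc`, `retrieve`, and `buildPrompt` are hypothetical and do not come from the sample.

```typescript
// Sketch only: toy stand-ins for the two RAG components.
// The sample itself uses Azure AI Search and an LLM; these names are illustrative.

interface Doc {
  id: string;
  text: string;
}

/** Retriever: ranks documents by how many query words they contain (a stand-in for vector search). */
function retrieve(query: string, docs: Doc[], topK = 2): Doc[] {
  const words = query.toLowerCase().split(/\W+/).filter(Boolean);
  return docs
    .map((doc) => ({
      doc,
      score: words.filter((word) => doc.text.toLowerCase().includes(word)).length,
    }))
    .filter((scored) => scored.score > 0)
    .sort((a, b) => b.score - a.score)
    .slice(0, topK)
    .map((scored) => scored.doc);
}

/** Builds the augmented prompt that would be handed to the generator (the LLM). */
function buildPrompt(question: string, sources: Doc[]): string {
  const context = sources.map((source) => `[${source.id}] ${source.text}`).join('\n');
  return `Answer using only these sources:\n${context}\n\nQuestion: ${question}`;
}
```

In a real pipeline, the string returned by `buildPrompt` would be sent to the LLM, and the retriever would compare embeddings rather than count shared words.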

[Figure: Retrieval-Augmented Generation schema]

How can we upload additional documents without redeploying everything?

To upload more documents, first put your PDF document in the data/ folder, then use one of these commands depending on your environment.

For local development

Make sure your API is started by running npm run start:api from the root of the project. Then you can use one of the following commands to upload a new PDF document:

# If you're using a POSIX shell
curl -F "file=@data/<your-document.pdf>" http://localhost:7071/api/documents

# If you're using PowerShell (6.1 or later, for the -Form parameter)
Invoke-RestMethod -Uri "http://localhost:7071/api/documents" -Method Post -Form @{ file = Get-Item "./data/<your-document.pdf>" }

You can also use the following command to re-upload all PDF files in the data/ folder at once:

npm run upload:docs
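If you would rather upload from a script, the curl command above can be approximated in TypeScript. This is a sketch, assuming Node.js 18 or later (for the global `fetch`, `FormData`, and `Blob`) and assuming the endpoint reads the PDF from a multipart field named `file`, as the curl `-F "file=@..."` form suggests. `buildUploadForm` and `uploadPdf` are hypothetical helper names, not part of the sample.

```typescript
// Sketch only: mirrors `curl -F "file=@data/<your-document.pdf>" <api_url>/api/documents`.
// Assumes Node.js 18+ globals (fetch, FormData, Blob).

/** Wraps a PDF in a multipart form under the "file" field. */
function buildUploadForm(pdf: Blob, filename: string): FormData {
  const form = new FormData();
  form.append('file', pdf, filename);
  return form;
}

/** POSTs the form to <apiUrl>/api/documents and throws on a non-2xx response. */
async function uploadPdf(apiUrl: string, pdf: Blob, filename: string): Promise<void> {
  const response = await fetch(`${apiUrl}/api/documents`, {
    method: 'POST',
    body: buildUploadForm(pdf, filename),
  });
  if (!response.ok) {
    throw new Error(`Upload failed: ${response.status} ${response.statusText}`);
  }
}
```

For local development, `apiUrl` would be `http://localhost:7071`, and the `Blob` would be built from the contents of a file in the data/ folder.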

For the deployed version

First you need to find the URL of the deployed function. You can either look at the packages/api/.env file and search for the API_URI variable, or run this command to get the URL:

azd env get-values | grep API_URI
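If grep is not available in your shell, the same lookup can be done in a small script. The sketch below assumes that `azd env get-values` prints one `KEY="value"` pair per line; `parseEnvValues` is a hypothetical helper name, not part of the sample or of azd.

```typescript
// Sketch only: parses KEY="value" lines, the shape `azd env get-values` is assumed to print.

function parseEnvValues(output: string): Record<string, string> {
  const values: Record<string, string> = {};
  for (const line of output.split(/\r?\n/)) {
    const match = line.match(/^([A-Za-z_][A-Za-z0-9_]*)=(.*)$/);
    if (!match) continue; // skip blank or malformed lines
    // Strip one pair of surrounding double quotes, if present.
    values[match[1]] = match[2].replace(/^"(.*)"$/, '$1');
  }
  return values;
}
```

Passing the command's output to `parseEnvValues` and reading the `API_URI` key gives the same URL as the grep command.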

Then you can use one of the following commands to upload a new PDF document:

# If you're using a POSIX shell
curl -F "file=@data/<your-document.pdf>" <your_api_url>/api/documents

# If you're using PowerShell (6.1 or later, for the -Form parameter)
Invoke-RestMethod -Uri "<your_api_url>/api/documents" -Method Post -Form @{ file = Get-Item "./data/<your-document.pdf>" }

You can also use the following command to re-upload all PDF files in the data/ folder at once:

node scripts/upload-documents.js <your_api_url>

Why do we need to break up the documents into chunks?

LLMs have token limits, so chunking lets us limit the amount of information we send with each request. Breaking up the content also makes it easier to find the specific passages relevant to a question and inject only those into the prompt, which improves the relevance of the results. The chunking method we use is a sliding window: the sentences that end one chunk also start the next. This overlap reduces the chance of losing context at chunk boundaries.
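The sliding-window idea can be illustrated with a short sketch. This is not the splitter the sample uses. It assumes a naive sentence split on `.`, `!`, and `?`, and for simplicity it counts chunk size and overlap in sentences, where production splitters usually count characters or tokens. `chunkSentences` is a hypothetical name.

```typescript
// Sketch only: sentence-based sliding window, not the sample's actual splitter.

function chunkSentences(text: string, chunkSize = 3, overlap = 1): string[] {
  if (chunkSize < 1 || overlap < 0 || overlap >= chunkSize) {
    throw new Error('expected chunkSize >= 1 and 0 <= overlap < chunkSize');
  }
  // Naive sentence split: runs of text ending in ., ! or ?, plus any trailing remainder.
  const sentences = (text.match(/[^.!?]+[.!?]+|[^.!?]+$/g) || [])
    .map((sentence) => sentence.trim())
    .filter(Boolean);

  const chunks: string[] = [];
  const step = chunkSize - overlap; // how far the window advances each time
  for (let start = 0; start < sentences.length; start += step) {
    chunks.push(sentences.slice(start, start + chunkSize).join(' '));
    if (start + chunkSize >= sentences.length) break; // the window reached the end
  }
  return chunks;
}

// With chunkSize 3 and overlap 1, the last sentence of one chunk opens the next:
// chunkSentences('A. B. C. D. E.') returns ['A. B. C.', 'C. D. E.']
```

A larger overlap preserves more context across boundaries, but it also stores and embeds more duplicated text.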

How do you change the models used in this sample?

You can change the chat and embeddings models used by the deployed sample by setting environment variables. Run these commands:

azd env set AZURE_OPENAI_API_MODEL gpt-4
azd env set AZURE_OPENAI_API_MODEL_VERSION 0125-preview
azd env set AZURE_OPENAI_API_EMBEDDINGS_MODEL text-embedding-3-large
azd env set AZURE_OPENAI_API_EMBEDDINGS_MODEL_VERSION 1

You may also need to adjust the capacity in the infra/main.bicep file, depending on how many tokens per minute (TPM) your account is allowed.

Local models

To change the local models used by Ollama, you can edit the file packages/api/src/constants.ts:

export const ollamaEmbeddingsModel = 'all-minilm:l6-v2';
export const ollamaChatModel = 'mistral';

You can see the complete list of available models at https://ollama.ai/models.

After changing the models, you also need to fetch the new models by running the command:

ollama pull <model-name>

What does the azd up command do?

The azd up command comes from the Azure Developer CLI, and takes care of both provisioning the Azure resources and deploying code to the selected Azure hosts.

The azd up command uses the azure.yaml file combined with the infrastructure-as-code .bicep files in the infra/ folder. The azure.yaml file for this project declares "hooks" for the prepackage and postprovision steps. The up command first runs the prepackage hook, which installs Node dependencies and builds the TypeScript files. It then packages all the code (both frontend and backend services) into a zip file, which it deploys later.

Next, it provisions the resources based on main.bicep and main.parameters.json. At that point, since there is no default value for the OpenAI resource location, it asks you to pick a location from a short list of available regions. Then it will send requests to Azure to provision all the required resources. With everything provisioned, it runs the postprovision hook to process the local data and add it to an Azure AI Search index.

Finally, it looks at azure.yaml to determine the Azure host (Functions and Static Web Apps, in this case) and uploads the zip to Azure. The azd up command is now complete, but it may take some time for the app to be fully available and working after the initial deploy.

Related commands are azd provision for just provisioning (if infra files change) and azd deploy for just deploying updated app code.