Manual Installation for Zero Trust

This guide details the manual installation of the Solution Accelerator within a Zero Trust architecture. It includes prerequisites, a comprehensive list of required resources, and a step-by-step installation process.

If you prefer to proceed with the automated installation, please follow the instructions in the Getting Started section. However, feel free to read this document to understand the process behind the scenes.

Prerequisites

You will also need the following permissions:

  • Role: Owner or Contributor + User Access Administrator
  • Scope: Resource group or Subscription

Alternatively, you can create a Custom Role.

Resources List

Here is the complete list of resources for a standard Zero Trust deployment, including descriptions and SKUs. These defaults have been extensively tested in the automated installation. You can adjust them during manual installation to suit your needs, considering usage factors like user volume and data.

App Services

  • App Service Plan
    Hosts the frontend and function apps.
    • SKU: P0v3
    • Operating System: Linux
    • Zone Redundant: Disabled
  • Function App (Orchestrator)
    Orchestrates the RAG flow.
    • Operating System: Linux
    • LinuxFxVersion: python|3.11
  • Function App (Data Ingestion)
    Supports the Data Ingestion Pipeline.
    • Operating System: Linux
    • LinuxFxVersion: python|3.11
  • App Service (Frontend)
    Provides the Web User Interface.
    • Operating System: Linux
    • LinuxFxVersion: python|3.12
  • Application Insights
    Provides real-time monitoring for apps.
    • Type: Classic

Security

  • Key Vault (Application)
    Stores API keys when needed.
    • SKU: Standard
    • Soft Delete: Enabled
    • Purge Protection: Enabled
  • Key Vault (Test VM Bastion)
    Used by Bastion to store the Test VM password.
    • SKU: Standard
    • Soft Delete: Enabled
    • Purge Protection: Enabled

AI Services

  • Azure AI Services Multi-Service Account
    Reads documents (Data Ingestion) and interacts with users (Web UI).
    • SKU: Standard
  • Azure OpenAI
    Generates responses and vector embeddings.
    • SKU: Standard
    • Deployments:
      • Regional gpt-4o, 40 TPM.
      • text-embedding-ada-002, 40 TPM.
  • Search Service
    Provides vector indexes for the retrieval step.
    • SKU: Standard2
    • Replicas: 1
    • Partitions: 1

Compute

  • Virtual Machine (Test VM)
    Provides access to configure and test the solution after disabling public endpoints.
    • Operating System: Windows (Windows Server 2019 Datacenter)
    • SKU: Standard_D4s_v3 (4 vCPUs, 16 GiB memory)
    • Image Publisher: microsoft-dsvm (Data Science VM)
    • Image Offer: dsvm-win-2019

Storage

  • Storage Account (Documents)
    Stores content used for grounding responses.
    • Performance: Standard
    • Replication: Locally-redundant storage (LRS)
    • Account Type: StorageV2 (general purpose v2)
  • Storage Account (Orchestrator Function App)
    Stores logs, code, and execution state for the Orchestrator Function App.
    • Performance: Standard
    • Replication: Locally-redundant storage (LRS)
    • Account Type: Storage (general purpose v1)
  • Storage Account (Data Ingestion Function App)
    Stores logs, code, and execution state for the Data Ingestion Function App.
    • Performance: Standard
    • Replication: Locally-redundant storage (LRS)
    • Account Type: Storage (general purpose v1)
  • Test VM Disk
    Disk for the Test VM.
    • Disk Size: 128 GiB
    • Storage Type: Premium SSD LRS
    • Operating System: Windows

Database

  • Azure Cosmos DB
    Stores conversation history and metadata to improve quality.
    • Kind: GlobalDocumentDB
    • Database Account Offer Type: Standard
    • Capacity Mode: Provisioned throughput

Networking

  • Virtual Network
    AI Services VNet.
    • Address Space: 10.0.0.0/23

The address range is a suggestion; use whatever works for your environment.

  • Subnets
    Designate network segments in the AI Services VNet to organize and secure traffic.

    • Subnets:
      • ai-subnet
        10.0.0.0/26
      • app-services-subnet
        10.0.0.192/26
      • database-subnet
        10.0.1.0/26
      • app-int-subnet
        10.0.0.128/26
      • AzureBastionSubnet
        10.0.0.64/26

    The address ranges are suggestions; use whatever works for your environment.

  • Private Endpoints
    Enable private, secure access to Azure services via a virtual network.

    • Private Endpoints (PEs):
      • AI Search Private Endpoint
      • AI Services Private Endpoint
      • Azure OpenAI Private Endpoint
      • CosmosDB Private Endpoint
      • Data Ingestion Function App Private Endpoint
      • Frontend App Service Private Endpoint
      • Key Vault Private Endpoint
      • Orchestrator Function App Private Endpoint
      • Storage Account (Documents) Private Endpoint
  • Private DNS Zones
    Resolve private endpoints to private IPs within a virtual network.

    • Private DNS Zones:
      • App Service and Function Apps Private DNS
        privatelink.azurewebsites.net
      • AI Services Private DNS
        privatelink.cognitiveservices.azure.com
      • Azure OpenAI Private DNS
        privatelink.openai.azure.com
      • Storage Account (Documents) Private DNS
        privatelink.blob.core.windows.net
      • CosmosDB Private DNS
        privatelink.documents.azure.com
      • AI Search Private DNS
        privatelink.search.windows.net
      • Key Vault Private DNS
        privatelink.vaultcore.azure.net
  • Network Interfaces
    Provide connectivity to private endpoints and virtual machines within the AI Services VNet.

    • Interfaces:
      • AI Search PE's Network Interface
      • AI Services PE's Network Interface
      • Azure OpenAI PE's Network Interface
      • CosmosDB PE's Network Interface
      • Data Ingestion Function App PE's Network Interface
      • Frontend App Service PE's Network Interface
      • Key Vault PE's Network Interface
      • Orchestrator Function App PE's Network Interface
      • Storage Account (Documents) PE's Network Interface
      • Test Virtual Machine Network Interface
  • Bastion
    Enables private and secure access to the Test VM without exposing the VM directly to the internet.

    • Tier: Standard
  • Public IP
    Used by Bastion to enable secure access to the Test VM.

    • SKU: Standard
    • Tier: Regional

Installation Procedure

Before You Begin

Gather Necessary Information:

  • Resource Group Name
  • Location
  • AI VNet Address Range
  • Subnets IP Range
    • ai-subnet
    • app-services-subnet
    • database-subnet
    • app-int-subnet
    • AzureBastionSubnet
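
Before creating anything, it can help to confirm that every subnet range you gathered actually falls inside the VNet address range. The following is a minimal sketch in plain shell arithmetic, not part of the accelerator itself; the function names `ip_to_int` and `cidr_within` and the variable `vnetRange` are illustrative, and the ranges are example values to replace with your own.

```shell
# Convert a dotted-quad IPv4 address to a 32-bit integer.
ip_to_int() {
    old_ifs=$IFS
    IFS=.
    set -- $1          # split the address on dots into $1..$4
    IFS=$old_ifs
    echo $(( ($1 << 24) + ($2 << 16) + ($3 << 8) + $4 ))
}

# Succeed if CIDR block $1 lies entirely inside CIDR block $2.
cidr_within() {
    child_ip=${1%/*};  child_len=${1#*/}
    parent_ip=${2%/*}; parent_len=${2#*/}
    # A child with a shorter prefix is larger than the parent and cannot fit.
    [ "$child_len" -ge "$parent_len" ] || return 1
    mask=$(( (0xFFFFFFFF << (32 - $parent_len)) & 0xFFFFFFFF ))
    [ $(( $(ip_to_int "$child_ip") & $mask )) -eq $(( $(ip_to_int "$parent_ip") & $mask )) ]
}

# Example values (assumptions): replace with the ranges you gathered.
vnetRange="10.0.0.0/23"
for subnet in 10.0.0.0/26 10.0.0.64/26 10.0.0.128/26 10.0.0.192/26 10.0.1.0/26; do
    if cidr_within "$subnet" "$vnetRange"; then
        echo "$subnet fits in $vnetRange"
    else
        echo "$subnet is OUTSIDE $vnetRange"
    fi
done
```

The check only tests containment; it does not detect subnets that overlap each other.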

1. Creation of Core Components

  1. Resource Group

    • Create a resource group in Azure where all components will be deployed.
  2. Virtual Network (AI VNet)

    • Create a VNet with the required subnets:
      • ai-subnet (AI services subnet)
      • AzureBastionSubnet (Bastion subnet)
      • app-int-subnet (App Integration subnet)
      • app-services-subnet (App Services subnet)
      • database-subnet (Database subnet)
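
The first two steps could be scripted roughly as follows. This is a sketch under stated assumptions rather than the procedure used by the automated installation: `create_network` is a hypothetical helper, and every argument is a placeholder for the values you gathered earlier.

```shell
# Create the resource group, the AI VNet, and its five subnets.
# Arguments: resource group, location, VNet name, VNet range, then one
# range per subnet in the order of the loop below.
create_network() {
    rg=$1; location=$2; vnet=$3; vnet_range=$4
    shift 4
    az group create --name "$rg" --location "$location"
    az network vnet create \
        --resource-group "$rg" \
        --name "$vnet" \
        --location "$location" \
        --address-prefixes "$vnet_range"
    for name in ai-subnet AzureBastionSubnet app-int-subnet app-services-subnet database-subnet; do
        az network vnet subnet create \
            --resource-group "$rg" \
            --vnet-name "$vnet" \
            --name "$name" \
            --address-prefixes "$1"
        shift    # move on to the next subnet range
    done
}
```

A call might look like `create_network "$resourceGroupName" "$location" "$vnetName" "$vnetRange" "$aiSubnetRange" "$bastionSubnetRange" "$appIntSubnetRange" "$appServicesSubnetRange" "$databaseSubnetRange"`, where all variable names are illustrative. If app-int-subnet is used for App Service VNet integration, it will typically also need to be delegated to Microsoft.Web/serverFarms, which this sketch does not configure.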
  3. Test VM

    • Create a Windows VM in the same VNet to test and access resources without a public address.
    • Create the Data Science VM in the AI Subnet.
    • Set up Azure Bastion to reach the VM from the internet.
    • Optionally, create a Key Vault to store the VM password as a secret.
  4. Database

    • Cosmos DB
      • Create a Cosmos DB account with a database with two containers:
        • conversations
        • models
      • Disable public network access.
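
As an illustration, the Cosmos DB step might be scripted like this. `create_cosmos` is a hypothetical helper; the database name and the partition key path are not specified in this guide, so both are passed in as arguments and should be treated as assumptions to verify against the applications' requirements.

```shell
# Create the Cosmos DB account, one database, and the two containers,
# then turn off public network access.
# Arguments: resource group, account name, database name, partition key path.
# The database name and partition key path are assumptions (not given in
# this guide); replace them with the values your applications expect.
create_cosmos() {
    rg=$1; account=$2; database=$3; pk_path=$4
    az cosmosdb create --resource-group "$rg" --name "$account"
    az cosmosdb sql database create \
        --resource-group "$rg" --account-name "$account" --name "$database"
    for container in conversations models; do
        az cosmosdb sql container create \
            --resource-group "$rg" --account-name "$account" \
            --database-name "$database" --name "$container" \
            --partition-key-path "$pk_path"
    done
    az cosmosdb update --resource-group "$rg" --name "$account" \
        --public-network-access DISABLED
}
```

Public network access is disabled last so that the containers can still be created from a machine outside the VNet.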
  5. AI Services

    • Azure AI Services

      • Create Azure AI Services.
      • Disable public network access.
    • Azure OpenAI

      • Create an Azure OpenAI service.
      • Create deployments:
        • Regional gpt-4o, 40 TPM.
        • text-embedding-ada-002, 40 TPM.
      • Disable public network access.
    • Azure AI Search

      • Create an Azure AI Search service with the Standard2 SKU.
      • Create shared private link connections to:
        • Data Ingestion Function App
        • Blob Storage Account (Documents)
  6. Storage

    • Storage Account
      • Create a storage account to store documents.
      • Create the following blob containers:
        • documents
        • documents-images
        • documents-raw
      • Disable public access.
      • Enable soft delete for blobs.
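
A possible Azure CLI version of the storage step is sketched below. `create_documents_storage` is a hypothetical helper, the 7-day retention period is an assumed value, and "disable public access" is interpreted here as disabling public network access on the account.

```shell
# Create the documents storage account with its three blob containers,
# enable blob soft delete, and disable public network access.
# Arguments: resource group, storage account name, location.
create_documents_storage() {
    rg=$1; account=$2; location=$3
    az storage account create \
        --resource-group "$rg" --name "$account" --location "$location" \
        --sku Standard_LRS --kind StorageV2
    for container in documents documents-images documents-raw; do
        az storage container create \
            --account-name "$account" --name "$container" --auth-mode login
    done
    # Retention period of 7 days is an assumption; choose your own.
    az storage account blob-service-properties update \
        --resource-group "$rg" --account-name "$account" \
        --enable-delete-retention true --delete-retention-days 7
    az storage account update \
        --resource-group "$rg" --name "$account" \
        --public-network-access Disabled
}
```

With `--auth-mode login`, the signed-in user needs a data-plane role such as Storage Blob Data Contributor on the account for the container creation to succeed.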
  7. Security

    • Key Vault (Application)
      • Create the Azure Key Vault with the following secrets:
        • azureOpenAIKey (Azure OpenAI API Key)
        • azureSearchKey (Azure AI Search query API Key)
        • formRecKey (Azure AI Services API Key)
        • speechKey (Azure AI Services API Key)
      • Disable public access.
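
One way to script the Key Vault step is sketched here. The helper names are hypothetical. RBAC authorization is enabled on the assumption that access is granted through the "Key Vault Secrets User" role described in the Permissions step; soft delete is on by default for new vaults.

```shell
# Create the application Key Vault with RBAC authorization and purge protection.
# Arguments: resource group, vault name, location.
create_app_key_vault() {
    rg=$1; vault=$2; location=$3
    az keyvault create \
        --resource-group "$rg" --name "$vault" --location "$location" \
        --sku standard \
        --enable-rbac-authorization true \
        --enable-purge-protection true
}

# Store one secret. Arguments: vault name, secret name, secret value.
store_secret() {
    az keyvault secret set --vault-name "$1" --name "$2" --value "$3" --output none
}

# Disable public network access once the secrets are in place.
# Arguments: resource group, vault name.
lock_down_key_vault() {
    az keyvault update --resource-group "$1" --name "$2" \
        --public-network-access Disabled
}
```

For example, `store_secret "$keyVaultName" azureOpenAIKey "$(az cognitiveservices account keys list --name "$openAIAccountName" --resource-group "$resourceGroupName" --query key1 --output tsv)"` would copy the first Azure OpenAI key into the vault. The same pattern applies to the other three secrets.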
  8. App Services

    • App Service Plan

      • Create an App Service Plan with the appropriate SKU and OS specifications.
    • Function Apps

      • Create function apps for the orchestrator and data ingestion.
    • Web App (Front-end)

      • Create the web app for the frontend.

  9. Private DNS Zones

    • Configure private DNS zones for the following services:
      • App Service and Function Apps: privatelink.azurewebsites.net
      • AI Services: privatelink.cognitiveservices.azure.com
      • Azure OpenAI: privatelink.openai.azure.com
      • Storage Account (Documents): privatelink.blob.core.windows.net
      • CosmosDB: privatelink.documents.azure.com
      • AI Search: privatelink.search.windows.net
      • Key Vault: privatelink.vaultcore.azure.net
  10. Private Endpoints

    • Set up private endpoints to secure communication within the VNet.
      • AI Search Private Endpoint
      • AI Services Private Endpoint
      • Azure OpenAI Private Endpoint
      • CosmosDB Private Endpoint
      • Data Ingestion Function App Private Endpoint
      • Frontend App Service Private Endpoint
      • Key Vault Private Endpoint
      • Orchestrator Function App Private Endpoint
      • Storage Account (Documents) Private Endpoint
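
The DNS zone and private endpoint steps follow the same pattern for every service, so they lend themselves to two small helpers. The sketch below is illustrative only: the function names, the link and connection naming scheme, and the choice of subnet per endpoint are assumptions. The group ID (sub-resource) depends on the target service; commonly used values are `sites` for App Service and Function Apps, `account` for AI Services and Azure OpenAI, `blob` for Blob Storage, `Sql` for Cosmos DB, `searchService` for AI Search, and `vault` for Key Vault, but verify them against the current Azure documentation.

```shell
# Create a private DNS zone and link it to the VNet (run once per zone).
# Arguments: resource group, VNet name, zone name.
create_dns_zone() {
    rg=$1; vnet=$2; zone=$3
    az network private-dns zone create --resource-group "$rg" --name "$zone"
    az network private-dns link vnet create \
        --resource-group "$rg" --zone-name "$zone" \
        --name "${vnet}-link" --virtual-network "$vnet" \
        --registration-enabled false
}

# Create a private endpoint for one resource and register it in its DNS zone.
# Arguments: resource group, VNet name, subnet name, endpoint name,
#            target resource ID, group ID, DNS zone name.
create_private_endpoint() {
    rg=$1; vnet=$2; subnet=$3; pe=$4; resource_id=$5; group_id=$6; zone=$7
    az network private-endpoint create \
        --resource-group "$rg" --name "$pe" \
        --vnet-name "$vnet" --subnet "$subnet" \
        --private-connection-resource-id "$resource_id" \
        --group-id "$group_id" \
        --connection-name "${pe}-connection"
    # The zone configuration name is derived from the zone, with dots replaced.
    az network private-endpoint dns-zone-group create \
        --resource-group "$rg" --endpoint-name "$pe" \
        --name default \
        --private-dns-zone "$zone" \
        --zone-name "$(echo "$zone" | tr . -)"
}
```

Splitting zone creation from endpoint creation matters for privatelink.azurewebsites.net, which is shared by the frontend and both function apps: the zone and its VNet link are created once, and each of the three endpoints is then registered in it.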
  11. Permissions

    • Assign permissions to the Managed Identities of various components in Azure. In each item, sample commands are provided to assign these roles; please use them by replacing the variables with the values specific to your environment.

    • Storage Account

      • Assign Storage Account "Storage Blob Data Reader" role to the Frontend App Service Managed Identity.
      az role assignment create \
          --assignee $principalId \
          --role "Storage Blob Data Reader" \
          --scope "/subscriptions/$subscriptionId/resourceGroups/$resourceGroupName/providers/Microsoft.Storage/storageAccounts/$storageAccountName"
      
      • Assign Storage Account "Storage Blob Data Contributor" role to the Data Ingestion Function App Identity.
      az role assignment create \
          --assignee $principalId \
          --role "Storage Blob Data Contributor" \
          --scope "/subscriptions/$subscriptionId/resourceGroups/$resourceGroupName/providers/Microsoft.Storage/storageAccounts/$storageAccountName"
      
    • Key Vault

      • Assign Key Vault "Key Vault Secrets User" role to the Identities of the following Apps:
        • Orchestrator Function App
        • Data Ingestion Function App
        • Frontend App Service
      az role assignment create \
          --assignee $principalId \
          --role "Key Vault Secrets User" \
          --scope "/subscriptions/$subscriptionId/resourceGroups/$resourceGroupName/providers/Microsoft.KeyVault/vaults/$keyVaultName"
      
    • Cosmos DB

      • Assign Cosmos DB "Cosmos DB Built-in Data Contributor" role to Orchestrator's Managed Identity.
      az cosmosdb sql role assignment create \
          --account-name $cosmosDbAccountName \
          --resource-group $resourceGroupName \
          --role-definition-id 00000000-0000-0000-0000-000000000002 \
          --scope "/" \
          --principal-id $principalId
      
    • Azure AI Search

      • Assign AI Search "Search Index Data Reader" role to Orchestrator's Managed Identity.
          az role assignment create \
              --assignee $principalId \
              --role "Search Index Data Reader" \
              --scope "/subscriptions/$subscriptionId/resourceGroups/$resourceGroupName/providers/Microsoft.Search/searchServices/$searchServiceName"
      
    • Azure OpenAI

      • Assign Azure OpenAI "Cognitive Services OpenAI User" role to the Identities of the following Resources:
        • Orchestrator Function App
        • Data Ingestion Function App
          az role assignment create \
              --assignee $principalId \
              --role "Cognitive Services OpenAI User" \
              --scope "/subscriptions/$subscriptionId/resourceGroups/$resourceGroupName/providers/Microsoft.CognitiveServices/accounts/$openAIAccountName"
      
    • Azure AI Services

      • Assign AI Services "Cognitive Services Contributor" role to the Managed Identities of the following Apps:
        • Frontend App Service
        • Data Ingestion Function App
          az role assignment create \
            --assignee $principalId \
            --role "Cognitive Services Contributor" \
            --scope "/subscriptions/$subscriptionId/resourceGroups/$resourceGroupName/providers/Microsoft.CognitiveServices/accounts/$aiServicesAccountName"
      
    • Application Insights

      • Assign Application Insights "Application Insights Component Contributor" role to the Managed Identities of the following Apps:
        • Orchestrator Function App
        • Data Ingestion Function App
        • Frontend App Service
          az role assignment create \
              --assignee $principalId \
              --role 'Application Insights Component Contributor' \
              --scope "/subscriptions/$subscriptionId/resourceGroups/$resourceGroupName/providers/microsoft.insights/components/$appInsightsName"
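
The commands above all expect `$principalId` to hold the object ID of an app's managed identity. A minimal sketch of how that value could be looked up, and how one role could be assigned to several identities in a single call, is shown below. It assumes system-assigned identities are enabled on the apps; `get_principal_id` and `assign_role` are hypothetical helper names.

```shell
# Print the principal ID of an app's system-assigned managed identity.
# $1 is "webapp" or "functionapp", $2 the app name, $3 the resource group.
get_principal_id() {
    az "$1" identity show --name "$2" --resource-group "$3" \
        --query principalId --output tsv
}

# Assign one role on one scope to every principal ID passed after the scope.
# Arguments: role name, scope, then one or more principal IDs.
assign_role() {
    role=$1; scope=$2
    shift 2
    for principalId in "$@"; do
        az role assignment create \
            --assignee "$principalId" --role "$role" --scope "$scope"
    done
}
```

For instance, after setting `orchestratorId=$(get_principal_id functionapp "$orchestratorAppName" "$resourceGroupName")` and the equivalent for the other two apps, `assign_role "Key Vault Secrets User" "$keyVaultScope" "$orchestratorId" "$ingestionId" "$frontendId"` would cover the Key Vault item in one step. The variable names here are illustrative.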
      
  12. Application Settings

    • The applications (orchestrator, data ingestion, and frontend web app) use information obtained from environment variables. These variables are stored as App Settings in each application. When using the automated procedure, you don't need to worry about this. However, for the manual procedure, you'll need to set these variables manually.
    • Click here to see an example of how to set these variables using the Azure CLI. You will need to adjust the commands to correctly reflect the names of your applications and parameters.
  13. Application Deployment

    • Once resources are provisioned and settings configured, you’re ready to deploy each application.
    • First, clone the repositories for each application.
    • For the Orchestrator Function App and Data Ingestion Function App:
      • In VSCode with the Azure Function App Extension, go to the Azure panel, locate your Function App in the resource explorer, right-click on it, and select Deploy.
    • For the App Service Frontend deployment, refer to the deployment section in the frontend repo.
  14. External Access

    • To expose the app to the internet, restrict inbound access to your App Service so that only Azure Front Door can reach it.
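
One common way to express this restriction is an App Service access rule that allows the AzureFrontDoor.Backend service tag and matches the Front Door instance's ID in the `X-Azure-FDID` header. The sketch below assumes that approach; `restrict_to_front_door`, the rule name, and the priority are illustrative choices, not values taken from this guide.

```shell
# Allow inbound traffic to the frontend only from one Azure Front Door instance.
# Arguments: resource group, App Service name, Front Door ID (GUID).
restrict_to_front_door() {
    rg=$1; app=$2; front_door_id=$3
    az webapp config access-restriction add \
        --resource-group "$rg" --name "$app" \
        --rule-name AllowFrontDoor --action Allow --priority 100 \
        --service-tag AzureFrontDoor.Backend \
        --http-header "x-azure-fdid=$front_door_id"
}
```

Once an Allow rule exists, App Service denies traffic that matches no rule, so other public sources are blocked. Access rules only apply to traffic arriving over the public endpoint; a Front Door tier with Private Link support may be a better fit if public network access to the app stays disabled.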