Commit

feat: replaced NIM architecture diagram with self-made NIM on EKS architecture diagram
hustshawn committed Aug 18, 2024
1 parent 4e3ca79 commit 60e5e08
Showing 3 changed files with 2 additions and 4 deletions.
Binary file not shown.
6 changes: 2 additions & 4 deletions website/docs/gen-ai/inference/nvidia-nim-llama3.md
@@ -28,10 +28,6 @@ NIM abstracts away model inference internals such as execution engine and runtim

NIMs are packaged as container images on a per model/model family basis. Each NIM container comes with a model, such as `meta/llama3-8b-instruct`. These containers include a runtime that runs on any NVIDIA GPU with sufficient GPU memory, but some model/GPU combinations are optimized. NIM automatically downloads the model from NVIDIA NGC Catalog, leveraging a local filesystem cache if available.

-![NIM Architecture](img/nim-architecture.png)
-
-Source: https://docs.nvidia.com/nim/large-language-models/latest/introduction.html#architecture
-
## Overview of this deployment pattern on Amazon EKS

This pattern combines the capabilities of NVIDIA NIM, Amazon Elastic Kubernetes Service (EKS), and various AWS services to deliver a high-performance and cost-optimized model serving infrastructure.
@@ -48,6 +44,8 @@ This pattern combines the capabilities of NVIDIA NIM, Amazon Elastic Kubernetes

By combining these components, our proposed solution delivers a powerful and cost-effective model serving infrastructure tailored for large language models. With NVIDIA NIM's seamless integration and Amazon EKS's scalability with Karpenter, customers can achieve high performance while minimizing infrastructure costs.

+![NIM on EKS Architecture](img/nim-on-eks-arch.png)
+
## Deploying the Solution

### Prerequisites