Commit 83bdcbd

add 2.5.0 to tested version
Signed-off-by: Victor Chang <[email protected]>
1 parent d93d4fc commit 83bdcbd

File tree: 6 files changed, +148 −71 lines changed

Lines changed: 69 additions & 28 deletions

@@ -1,35 +1,62 @@
 # NVIDIA NV-CLIP
 
-NV-CLIP is a multimodal embeddings model for image and text and this is a sample application that shows how to use the OpenAI SDK with NVIDIA Inference Microservice (NIM). Whether you are using a NIM from [build.nvidia.com/](https://build.nvidia.com/) or a self-hosted NIM, this sample application will work for both.
+NV-CLIP is a multimodal embeddings model for image and text, and this is a sample application that shows how to use the OpenAI SDK with NVIDIA Inference Microservice (NIM). Whether you are using a NIM from [build.nvidia.com/](https://build.nvidia.com/) or [a self-hosted NIM](https://docs.nvidia.com/nim/nvclip/latest/getting-started.html#option-2-from-ngc), this sample application will work for both.
 
-### Quick Start
+## Quick Start
 
-1. Add API key in `nvidia_nim.yaml`
+Get your [API Key](https://docs.nvidia.com/nim/nvclip/latest/getting-started.html#generate-an-api-key) and start the sample application.
+
+1. Enter your API key in `nvidia_nim.yaml`
 2. `./dev_container build_and_run nvidia_nim_nvclip`
 
-## Configuring the sample application
+## Advanced
+
+### Configuring the sample application
 
 Use the `nvidia_nim.yaml` configuration file to configure the sample application:
 
-### Connection Information
+### NVIDIA-Hosted NV-CLIP NIM
+
+By default, the application is configured to use NVIDIA-hosted NV-CLIP NIM.
 
 ```
 nim:
-  base_url: https://integrate.api.nvidia.com/v1
-  api_key:
+  base_url: https://integrate.api.nvidia.com/v1
+  api_key:
 
 ```
 
-`base_url`: The URL of your NIM instance. Defaults to NVIDIA hosted NIMs.
-`api_key`: Your API key to access NVIDIA hosted NIMs.
+`base_url`: The URL of your NIM instance. Defaults to NVIDIA-hosted NIMs.
+`api_key`: Your API key to access NVIDIA-hosted NIMs.
 
-## Run the sample application
 
-There are a couple of options to run the sample application:
+Note: you may also configure your API key using an environment variable.
+E.g., `export API_KEY=...`
+
+```bash
+# To use NVIDIA hosted NIMs available on build.nvidia.com, export your API key first
+export API_KEY=[enter your API key here]
+```
+
 
-### Run using Docker
+### Self-Hosted NIMs
 
-To run the sample application with Docker, you must first build a Docker image that includes the sample application and its dependencies:
+To use a self-hosted NIM, refer to the [NV-CLIP](https://docs.nvidia.com/nim/nvclip/latest/getting-started.html) NIM documentation to configure and start the NIM.
+
+Then, comment out the NVIDIA-hosted section and uncomment the self-hosted configuration section in the `nvidia_nim.yaml` file.
+
+```bash
+nim:
+  base_url: http://0.0.0.0:8000/v1/
+  encoding_format: float
+  api_key: NA
+  model: nvidia/nvclip-vit-h-14
+```
+
+
+### Build The Application
+
+To run the sample application, you must first build a Docker image that includes the sample application and its dependencies:
 
 ```
 # Build the Docker images from the root directory of Holohub
@@ -39,35 +66,49 @@ To run the sample application with Docker, you must first build a Docker image t
 Then, run the Docker image:
 
 ```bash
-./dev_container launch
+./dev_container launch
 ```
 
 
-### Start the Application
+### Run the Application
 
 To use the NIMs on [build.nvidia.com/](https://build.nvidia.com/), configure your API key in the `nvidia_nim.yaml` configuration file and run the sample app as follows:
 
-note: you may also configure your api key using an environment variable.
-E.g., `export API_KEY=...`
-
 ```bash
-# To use NVIDIA hosted NIMs available on build.nvidia.com, export your API key first
-export API_KEY=[enter your api key here]
-
 ./run launch nvidia_nim_nvclip
 ```
 
-Have fun!
+## Using the Application
+
+Once the application is ready, it will prompt you to input URLs to the images you want to perform inference.
+
+```bash
+Enter a URL to an image: https://domain.to/my/image-cat.jpg
+Downloading image...
+
+Enter a URL to another image or hit ENTER to continue: https://domain.to/my/image-rabbit.jpg
+Downloading image...
+
+Enter a URL to another image or hit ENTER to continue: https://domain.to/my/image-dog.jpg
+Downloading image...
+
+```
 
+If there are no more images that you want to use, hit ENTER to continue and then enter a prompt:
 
-## Connecting with Locally Hosted NIMs
+```bash
+Enter a URL to another image or hit ENTER to continue:
 
-To use a locally hosted NIM, first download and start the NIM.
-Then configure the `base_url` parameter in the `nvidia_nim.yaml` configuration file to point to your local NIM instance.
+Enter a prompt: Which image contains a rabbit?
+```
 
-The following example shows a NIM running locally and serving its APIs and the `meta-llama3-8b-instruct` model from `http://0.0.0.0:8000/v1`.
+The application will connect to the NIM to generate an answer and then calculate the cosine similarity between the images and the prompt:
 
 ```bash
-nim:
-  base_url: http://0.0.0.0:8000/v1/
+⠧ Generating...
+Prompt: Which image contains a rabbit?
+Output:
+Image 1: 3.0%
+Image 2: 52.0%
+Image 3: 46.0%
 ```
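The README changes above wire the app to an OpenAI-compatible embeddings endpoint and note that the API key may come either from `nvidia_nim.yaml` or from the `API_KEY` environment variable. A minimal stdlib-only sketch of that resolution order — the helper name and the config-dict shape are illustrative, not the app's actual code:

```python
import os

def resolve_api_key(config):
    """Prefer the key from the YAML-derived config, else fall back to $API_KEY."""
    return config.get("nim", {}).get("api_key") or os.environ.get("API_KEY")

# A key present in the config wins over the environment.
print(resolve_api_key({"nim": {"api_key": "from-yaml"}}))  # from-yaml

# An empty config falls back to the API_KEY environment variable.
os.environ["API_KEY"] = "from-env"
print(resolve_api_key({"nim": {}}))  # from-env
```

The `or` fallback also covers the case where the YAML key is present but left blank, which matches the commented-out placeholder style of the sample's config file.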

applications/nvidia_nim/nvidia_nim_nvclip/app.py

Lines changed: 26 additions & 24 deletions

@@ -67,10 +67,7 @@ def setup(self, spec: OperatorSpec):
         spec.input("in")
 
     def compute(self, op_input, op_output, context):
-        input = op_input.receive("in")
-
-        message = input[0].copy()
-        message.append(input[1])
+        message = op_input.receive("in")
 
         # Reference: Cosine Similarity https://docs.nvidia.com/nim/nvclip/latest/getting-started.html#cosine-similarity
         # Calculate cosine similarity between images and text
@@ -93,7 +90,7 @@ def compute(self, op_input, op_output, context):
             f"Image {i+1}": round(float(d), 2) * 100 for i, d in enumerate(probabilities)
         }
 
-        print(f"\nPrompt: {input[1]}")
+        print(f"\nPrompt: {message[-1]}")
         print("Output:")
         for prob in probabilities:
             print(f"{prob}: {probabilities[prob]}%")
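The hunk above prints per-image percentages derived from cosine similarity between image and text embeddings (see the NV-CLIP getting-started page referenced in the code). A self-contained sketch of that math with made-up 3-dimensional embeddings — the real model returns much higher-dimensional vectors, and the pre-softmax scale factor here is an assumption, not the app's exact value:

```python
import math

def cosine_similarity(a, b):
    """Dot product of a and b divided by the product of their magnitudes."""
    dot = sum(x * y for x, y in zip(a, b))
    norm_a = math.sqrt(sum(x * x for x in a))
    norm_b = math.sqrt(sum(y * y for y in b))
    return dot / (norm_a * norm_b)

def softmax(xs):
    """Exponentiate and normalize so the scores sum to 1."""
    exps = [math.exp(x) for x in xs]
    total = sum(exps)
    return [e / total for e in exps]

# Hypothetical embeddings, one per image plus one for the text prompt.
image_embeddings = [[0.1, 0.9, 0.2], [0.8, 0.1, 0.6], [0.4, 0.4, 0.4]]
text_embedding = [0.7, 0.2, 0.5]

similarities = [cosine_similarity(img, text_embedding) for img in image_embeddings]
# Scaling before softmax sharpens the distribution (a common CLIP convention).
probabilities = softmax([s * 10 for s in similarities])
for i, p in enumerate(probabilities):
    print(f"Image {i + 1}: {round(p * 100, 1)}%")
```

This mirrors the shape of the app's output ("Image N: x%") without calling the NIM: the image whose embedding points most nearly in the same direction as the text embedding gets the largest share.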
@@ -114,9 +111,8 @@ def compute(self, op_input, op_output, context):
 
 
 class ExamplesOp(Operator):
-    def __init__(self, fragment, *args, spinner, example, **kwargs):
+    def __init__(self, fragment, *args, spinner, **kwargs):
         self.spinner = spinner
-        self.example = example
 
         # Need to call the base class constructor last
         super().__init__(fragment, *args, **kwargs)
@@ -126,37 +122,44 @@ def setup(self, spec: OperatorSpec):
 
     def compute(self, op_input, op_output, context):
         user_images = []
-        user_text = ""
-        use_example = False
+        user_prompt = ""
+
         while True:
-            print("\nEnter prompt or hit Enter to use default example: ", end="")
-            user_input = sys.stdin.readline().strip()
-            if user_input != "":
-                user_text = user_input
+            if len(user_images) == 0:
+                print("\nEnter a URL to an image: ", end="")
             else:
-                use_example = True
-                user_images = self.example["images"]
-                user_text = self.example["text"]
-                break
+                print("\nEnter a URL to another image or hit ENTER to continue: ", end="")
 
-        while True and not use_example:
-            print("\nEnter URL to an image or hit Enter to send the request: ", end="")
             user_input = sys.stdin.readline().strip()
             if user_input == "":
                 break
-            user_images.append(self.pre_process_input(user_input))
+
+            image_data = self.pre_process_input(user_input)
+            if image_data:
+                user_images.append(image_data)
+
+        while True:
+            print("\nEnter a prompt: ", end="")
+            user_input = sys.stdin.readline().strip()
+            if user_input != "":
+                user_prompt = user_input
+                break
 
         self.spinner.start()
-        op_output.emit((user_images, user_text), "out")
+
+        message = user_images + [user_prompt]
+
+        op_output.emit(message, "out")
 
     def pre_process_input(self, data):
         prepared_request = PreparedRequest()
         try:
             prepared_request.prepare_url(data, None)
             print("Downloading image...")
             return base64.b64encode(requests.get(prepared_request.url).content).decode("utf-8")
-        except Exception:
-            return data
+        except Exception as e:
+            logger.error("Error downloading image: %s", str(e))
+            return None
 
 
 class NVClipNIMApp(Application):
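The revised `pre_process_input` above downloads the URL and base64-encodes the response bytes, returning `None` on failure so the caller can skip bad inputs. The encoding step in isolation, with in-memory stand-in bytes (a PNG file signature) instead of a real download:

```python
import base64

def encode_image_bytes(raw):
    """Base64-encode raw image bytes into the ASCII string sent in the request."""
    return base64.b64encode(raw).decode("utf-8")

# Stand-in bytes, not a real image download.
fake_image = b"\x89PNG\r\n\x1a\n"
encoded = encode_image_bytes(fake_image)
print(encoded)  # iVBORw0KGgo=
print(base64.b64decode(encoded) == fake_image)  # True: encoding round-trips
```

Returning `None` instead of echoing the raw input (as the old code did) keeps malformed URLs out of `user_images`, which is why the calling loop now checks `if image_data:` before appending.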
@@ -173,7 +176,6 @@ def compose(self):
             self,
             name="input",
             spinner=spinner,
-            example=self.kwargs("example"),
         )
         chat_op = OpenAIOperator(
             self,
Lines changed: 41 additions & 0 deletions

@@ -0,0 +1,41 @@
+#!/bin/bash
+# SPDX-FileCopyrightText: Copyright (c) 2024 NVIDIA CORPORATION & AFFILIATES. All rights reserved.
+# SPDX-License-Identifier: Apache-2.0
+#
+# Licensed under the Apache License, Version 2.0 (the "License");
+# you may not use this file except in compliance with the License.
+# You may obtain a copy of the License at
+#
+# http://www.apache.org/licenses/LICENSE-2.0
+#
+# Unless required by applicable law or agreed to in writing, software
+# distributed under the License is distributed on an "AS IS" BASIS,
+# WITHOUT WARRANTIES OR CONDITIONS OF ANY KIND, either express or implied.
+# See the License for the specific language governing permissions and
+# limitations under the License.
+
+# For usage and instructions, visit NV-CLIP at https://docs.nvidia.com/nim/nvclip/
+
+# Choose a container name for bookkeeping
+export CONTAINER_NAME=nvclip
+
+# The container name from the previous ngc registry image list command
+Repository=nvclip
+Latest_Tag=1.0.0
+
+# Choose a NV-CLIP NIM Image from NGC
+export IMG_NAME="nvcr.io/nim/nvidia/${Repository}:${Latest_Tag}"
+
+# Choose a path on your system to cache the downloaded models
+export LOCAL_NIM_CACHE=~/.cache/nim
+mkdir -p "$LOCAL_NIM_CACHE"
+
+# Start the NV-CLIP NIM
+docker run -it --rm --name=$CONTAINER_NAME \
+  --runtime=nvidia \
+  --gpus all \
+  -e NGC_API_KEY=$NGC_API_KEY \
+  -v "$LOCAL_NIM_CACHE:/opt/nim/.cache" \
+  -u $(id -u) \
+  -p 8000:8000 \
+  $IMG_NAME

applications/nvidia_nim/nvidia_nim_nvclip/metadata.json

Lines changed: 2 additions & 1 deletion

@@ -16,7 +16,8 @@
     "minimum_required_version": "1.0.3",
     "tested_versions": [
       "1.0.3",
-      "2.1.0"
+      "2.1.0",
+      "2.5.0"
     ]
   },
   "platforms": [
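The metadata change above (the commit's namesake) adds 2.5.0 to `tested_versions` alongside `minimum_required_version: "1.0.3"`. A small sketch of the numeric tuple comparison such a requirement implies — the helper is illustrative, not code from the repository, and assumes plain dotted numeric versions:

```python
def version_tuple(version):
    """Turn "2.5.0" into (2, 5, 0) so versions compare numerically, not lexically."""
    return tuple(int(part) for part in version.split("."))

minimum = version_tuple("1.0.3")
for tested in ["1.0.3", "2.1.0", "2.5.0"]:
    status = "meets" if version_tuple(tested) >= minimum else "is below"
    print(f"{tested} {status} the minimum 1.0.3")
```

Comparing tuples rather than strings matters because, for example, `"10.0.0" < "2.0.0"` lexically but not numerically.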

applications/nvidia_nim/nvidia_nim_nvclip/nvidia_nim.yaml

Lines changed: 9 additions & 15 deletions
Large diffs are not rendered by default.

run

Lines changed: 1 addition & 3 deletions

@@ -469,8 +469,7 @@ lint() {
     # Fix python black code formatting issues, run:
     run_command black ${DIR_TO_RUN} || exit_code=1
     run_command codespell -w -i 3 ${DIR_TO_RUN} --ignore-words codespell_ignore_words.txt \
-        --skip="*.onnx,*.min.js,*.min.js.map,Contrastive_learning_Notebook.ipynb,./data" \
-        --skip="./applications/nvidia_nim/nvidia_nim_nvclip/nvidia_nim.yaml" || exit_code=1
+        --skip="*.onnx,*.min.js,*.min.js.map,Contrastive_learning_Notebook.ipynb,./data" || exit_code=1
 
     # Fix cpplint with clang
     files_to_fix=`set -o pipefail; ${HOLOHUB_PY_EXE} -m cpplint \
@@ -516,7 +515,6 @@ lint() {
 
     echo "Code spelling"
     run_command codespell $DIR_TO_RUN --skip="*.onnx,*.min.js,*.min.js.map,Contrastive_learning_Notebook.ipynb,./data" \
-        --skip="./applications/nvidia_nim/nvidia_nim_nvclip/nvidia_nim.yaml" \
         --ignore-words codespell_ignore_words.txt \
         --exclude-file codespell.txt || exit_code=1