{
  "nbformat": 4,
  "nbformat_minor": 0,
  "metadata": {
    "colab": {
      "provenance": [],
      "gpuType": "T4"
    },
    "kernelspec": {
      "name": "python3",
      "display_name": "Python 3"
    },
    "language_info": {
      "name": "python"
    },
    "accelerator": "GPU"
  },
  "cells": [
    {
      "cell_type": "markdown",
      "source": [
        "# Build your ML application with Gradio\n",
        "\n",
        "Gradio is an open-source Python package that allows you to quickly build a demo or web application for your machine learning model, API, or any arbitrary Python function.\n",
        "\n",
        "See the documentation here: https://www.gradio.app/guides/quickstart"
      ],
      "metadata": {
        "id": "1b0gyWaU8Tsa"
      }
    },
    {
      "cell_type": "code",
      "source": [
        "%%capture\n",
        "!pip install gradio\n",
        "!pip install transformers"
      ],
      "metadata": {
        "id": "5NIoUtTFAOj9"
      },
      "execution_count": null,
      "outputs": []
    },
    {
      "cell_type": "code",
      "source": [
        "# Import the libraries used in this notebook\n",
        "import numpy as np\n",
        "import gradio as gr"
      ],
      "metadata": {
        "id": "Hp9QVonFzKSm"
      },
      "execution_count": null,
      "outputs": []
    },
    {
      "cell_type": "markdown",
      "source": [
        "Specify the input types and the output types for your interface."
      ],
      "metadata": {
        "id": "zpuy7NLrx6gY"
      }
    },
    {
      "cell_type": "code",
      "source": [
        "def greet(name, intensity):\n",
        "    return \"Hello, \" + name + \"!\" * int(intensity)\n",
        "\n",
        "demo = gr.Interface(\n",
        "    fn=greet,\n",
        "    inputs=[\"text\", \"slider\"],\n",
        "    outputs=[\"text\"],\n",
        ")\n",
        "\n",
        "demo.launch()"
      ],
      "metadata": {
        "id": "ILJkarqLxkDx"
      },
      "execution_count": null,
      "outputs": []
    },
    {
      "cell_type": "markdown",
      "source": [
        "If you use the actual classes `gr.Textbox` and `gr.Slider` instead of the string shortcuts, you have access to much more customizability through component attributes."
      ],
      "metadata": {
        "id": "1bKypbUIyGmU"
      }
    },
    {
      "cell_type": "code",
      "source": [
        "def greet(name, intensity):\n",
        "    return \"Hello, \" + name + \"!\" * int(intensity)\n",
        "\n",
        "demo = gr.Interface(\n",
        "    fn=greet,\n",
        "    inputs=[\"text\", gr.Slider(value=2, minimum=1, maximum=10, step=1)],\n",
        "    outputs=[gr.Textbox(label=\"greeting\", lines=3)],  # set the number of textbox lines\n",
        ")\n",
        "\n",
        "demo.launch()"
      ],
      "metadata": {
        "id": "SofEy7jwyDF2"
      },
      "execution_count": null,
      "outputs": []
    },
    {
      "cell_type": "code",
      "source": [
        "def filter_sepia(input_img):\n",
        "    # Classic sepia transform: each output channel is a weighted sum of the RGB channels\n",
        "    sepia_filter = np.array([\n",
        "        [0.393, 0.769, 0.189],\n",
        "        [0.349, 0.686, 0.168],\n",
        "        [0.272, 0.534, 0.131]\n",
        "    ])\n",
        "    sepia_img = input_img.dot(sepia_filter.T)\n",
        "    # Rescale to [0, 1] so the result displays correctly\n",
        "    sepia_img /= sepia_img.max()\n",
        "    return sepia_img\n",
        "\n",
        "demo = gr.Interface(filter_sepia, gr.Image(), \"image\")\n",
        "demo.launch()"
      ],
      "metadata": {
        "id": "cIZ3P92Gyx92"
      },
      "execution_count": null,
      "outputs": []
    },
    {
      "cell_type": "markdown",
      "source": [
        "See more examples at https://www.gradio.app/guides/the-interface-class"
      ],
      "metadata": {
        "id": "DCLzjY-szB_6"
      }
    },
    {
      "cell_type": "markdown",
      "source": [
        "## Make your first image-to-text app with Gradio"
      ],
      "metadata": {
        "id": "ER8CEY0Q8ev4"
      }
    },
    {
      "cell_type": "code",
      "source": [
        "# Libraries for loading and running the LLaVA image-to-text model\n",
        "from transformers import AutoProcessor, LlavaForConditionalGeneration\n",
        "from PIL import Image\n",
        "import torch\n",
        "import numpy as np\n",
        "import requests"
      ],
      "metadata": {
        "id": "4S347URd1YwF"
      },
      "execution_count": null,
      "outputs": []
    },
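    {
      "cell_type": "markdown",
      "source": [
        "Before loading the 7B model in float16, it can help to confirm that the Colab runtime actually has a GPU attached (the notebook metadata requests a T4). This is a small optional check added here as a sketch, not part of the original notebook; it only uses `torch.cuda`."
      ],
      "metadata": {}
    },
    {
      "cell_type": "code",
      "source": [
        "# Optional sanity check (added sketch): make sure a CUDA GPU is visible,\n",
        "# since the model below is loaded with torch_dtype=torch.float16 and moved to \"cuda\".\n",
        "if torch.cuda.is_available():\n",
        "    print(\"GPU available:\", torch.cuda.get_device_name(0))\n",
        "else:\n",
        "    print(\"No GPU detected - switch the Colab runtime to a GPU before loading the model\")"
      ],
      "metadata": {},
      "execution_count": null,
      "outputs": []
    },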
    {
      "cell_type": "code",
      "source": [
        "# Follow the documentation at https://huggingface.co/docs/transformers/en/model_doc/llava\n",
        "model_name = \"llava-hf/llava-1.5-7b-hf\"\n",
        "processor = AutoProcessor.from_pretrained(model_name)\n",
        "model = LlavaForConditionalGeneration.from_pretrained(\n",
        "    model_name,\n",
        "    torch_dtype=torch.float16,\n",
        "    device_map=\"auto\"\n",
        ")"
      ],
      "metadata": {
        "id": "JlATDwdby24c"
      },
      "execution_count": null,
      "outputs": []
    },
    {
      "cell_type": "code",
      "source": [
        "# Download an example image\n",
        "url = \"https://www.ilankelman.org/stopsigns/australia.jpg\"\n",
        "image_stop = Image.open(requests.get(url, stream=True).raw)\n",
        "\n",
        "# Build the chat-style prompt expected by LLaVA\n",
        "conversation = [\n",
        "    {\n",
        "        \"role\": \"user\",\n",
        "        \"content\": [\n",
        "            {\"type\": \"image\"},\n",
        "            {\"type\": \"text\", \"text\": \"What is shown in this image?\"},\n",
        "        ],\n",
        "    },\n",
        "]\n",
        "prompt = processor.apply_chat_template(conversation, add_generation_prompt=True)\n",
        "\n",
        "# Process the image and prompt\n",
        "inputs = processor(\n",
        "    images=[image_stop],\n",
        "    text=[prompt],\n",
        "    return_tensors=\"pt\"\n",
        ").to(device=\"cuda\", dtype=torch.float16)\n",
        "\n",
        "# Generate a response and decode it back to text\n",
        "generate_ids = model.generate(\n",
        "    **inputs,\n",
        "    do_sample=True,\n",
        "    max_new_tokens=100\n",
        ")\n",
        "processor.batch_decode(generate_ids, skip_special_tokens=True)"
      ],
      "metadata": {
        "id": "lasxF1d51tUm"
      },
      "execution_count": null,
      "outputs": []
    },
    {
      "cell_type": "markdown",
      "source": [
        "Now, let's put everything into one function and then test it."
      ],
      "metadata": {
        "id": "-Ya1BWfP8nF-"
      }
    },
    {
      "cell_type": "code",
      "source": [
        "def generate_description(image, prompt=\"What is shown in this image?\", max_new_tokens=200):\n",
        "    # Wrap the user prompt in LLaVA's chat format\n",
        "    conversation = [\n",
        "        {\n",
        "            \"role\": \"user\",\n",
        "            \"content\": [\n",
        "                {\"type\": \"image\"},\n",
        "                {\"type\": \"text\", \"text\": prompt},\n",
        "            ],\n",
        "        },\n",
        "    ]\n",
        "    prompt = processor.apply_chat_template(conversation, add_generation_prompt=True)\n",
        "    # Preprocess the image and prompt, generate, and decode the answer\n",
        "    inputs = processor(\n",
        "        images=[image],\n",
        "        text=[prompt],\n",
        "        return_tensors=\"pt\"\n",
        "    ).to(device=\"cuda\", dtype=torch.float16)\n",
        "    generate_ids = model.generate(\n",
        "        **inputs,\n",
        "        do_sample=True,\n",
        "        max_new_tokens=max_new_tokens\n",
        "    )\n",
        "    generated_description = processor.batch_decode(generate_ids, skip_special_tokens=True)\n",
        "    return generated_description[0]"
      ],
      "metadata": {
        "id": "75qgk39W5k_4"
      },
      "execution_count": null,
      "outputs": []
    },
    {
      "cell_type": "code",
      "source": [
        "# Test the function we just built on an image uploaded to the Colab session\n",
        "image = Image.open(\"/content/466029110_1113670126421850_13431688209473903_n.jpg\")\n",
        "generate_description(\n",
        "    image,\n",
        "    \"What is shown in this image?\"\n",
        ")"
      ],
      "metadata": {
        "id": "6QzFDNPl6hoQ"
      },
      "execution_count": null,
      "outputs": []
    },
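    {
      "cell_type": "markdown",
      "source": [
        "As an optional sanity check that does not depend on a file uploaded to `/content`, we could also run the function on `image_stop`, the photo downloaded from a URL earlier in the notebook. This cell is an added sketch and assumes the earlier cells have already been run."
      ],
      "metadata": {}
    },
    {
      "cell_type": "code",
      "source": [
        "# Added sketch: reuse the image downloaded earlier instead of a local upload\n",
        "generate_description(\n",
        "    image_stop,\n",
        "    \"What is shown in this image?\"\n",
        ")"
      ],
      "metadata": {},
      "execution_count": null,
      "outputs": []
    },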
    {
      "cell_type": "markdown",
      "source": [
        "Then serve it with Gradio. The inputs will be an image and a textbox (the prompt), and the output will be text (the description of the image)."
      ],
      "metadata": {
        "id": "B3soE-kP8sxh"
      }
    },
    {
      "cell_type": "code",
      "source": [
        "import gradio as gr\n",
        "\n",
        "demo = gr.Interface(\n",
        "    fn=lambda img, prompt: generate_description(img, prompt),\n",
        "    inputs=[gr.Image(type=\"pil\"),  # pass the image to the function as a PIL image\n",
        "            gr.Textbox(label=\"prompt\", value=\"What is shown in this image?\", lines=3)],\n",
        "    outputs=[gr.Textbox(label=\"Description\", lines=3)],\n",
        "    title=\"Image Description using LLaVA\",\n",
        "    description=\"Upload an image to get a detailed description using LLaVA-1.5-7b\",\n",
        ")\n",
        "demo.launch()"
      ],
      "metadata": {
        "id": "B_WndZzw1-ee"
      },
      "execution_count": null,
      "outputs": []
    },
    {
      "cell_type": "code",
      "source": [
        "# Launching demos repeatedly can leave several ports open, so don't forget to close them all with `gr.close_all()`\n",
        "gr.close_all()"
      ],
      "metadata": {
        "id": "hSzmRfxD6HuD"
      },
      "execution_count": null,
      "outputs": []
    }
  ]
}