Commit

Merge pull request #741 from NANDAGOPALNG/main
Added a new project called "Object Detector"
UppuluriKalyani authored Nov 3, 2024
2 parents 648eb07 + cb0a38d commit 1ef4c6b
Showing 4 changed files with 222 additions and 28 deletions.
92 changes: 92 additions & 0 deletions Generative Models/Object Detector/README.md
@@ -0,0 +1,92 @@
# Object Detector
## Description
The Object Detector is a computer vision project that uses deep learning algorithms to detect and identify objects in images and videos. This project can be used for a variety of applications, such as security monitoring, autonomous vehicles, and smart home systems.

## Features

1. Supports detection of multiple object classes in a single image or video frame

2. Provides bounding boxes and class labels for each detected object

3. Utilizes a pre-trained deep learning model for fast and accurate object detection

4. Allows for custom training of the object detection model on new datasets

5. Provides an easy-to-use Python API for integrating the object detector into your own projects

## Getting Started

**Prerequisites**

- Python 3.6 or higher
- TensorFlow 2.x or PyTorch 1.x
- OpenCV

## Installation

1. Clone the repository and change into the project folder:

```shell
git clone https://github.com/NANDAGOPALNG/ML-Nexus.git
cd "ML-Nexus/Generative Models/Object Detector"
```

2. Install the required dependencies:

```shell
pip install -r requirements.txt
```

## Usage

1. Import the object detector module:

```python
import cv2
from object_detector import ObjectDetector
```

2. Create an instance of the object detector:

```python
detector = ObjectDetector()
```

3. Detect objects in an image:

```python
image = cv2.imread('image.jpg')
detections = detector.detect(image)
```

4. Visualize the detected objects:

```python
for detection in detections:
    x, y, w, h = detection['bbox']
    label = detection['label']
    cv2.rectangle(image, (x, y), (x + w, y + h), (0, 255, 0), 2)
    cv2.putText(image, label, (x, y - 10), cv2.FONT_HERSHEY_SIMPLEX,
                0.5, (36, 255, 12), 2)

cv2.imshow('Object Detection', image)
cv2.waitKey(0)
```
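The `object_detector` module referenced above is not shipped in this commit (app.py exposes a Gradio app instead), so here is a minimal sketch of what such a wrapper could look like, assuming the same Hugging Face pipeline that app.py uses and the `{'bbox', 'label'}` record format from step 4. The class name and record keys follow this README; treat them as assumptions.

```python
# Hypothetical sketch: a thin ObjectDetector wrapper over the transformers
# object-detection pipeline used in app.py. Not part of the committed code.

class ObjectDetector:
    def __init__(self, model="facebook/detr-resnet-50"):
        # Imported lazily so the conversion helper below also works
        # without transformers installed.
        from transformers import pipeline
        self._pipe = pipeline("object-detection", model=model)

    @staticmethod
    def to_record(det):
        # Convert one pipeline result ({'score', 'label', 'box': {...}})
        # into the README's {'bbox': (x, y, w, h), 'label': ...} format.
        box = det["box"]
        return {
            "bbox": (box["xmin"], box["ymin"],
                     box["xmax"] - box["xmin"], box["ymax"] - box["ymin"]),
            "label": det["label"],
            "score": det["score"],
        }

    def detect(self, image):
        return [self.to_record(det) for det in self._pipe(image)]
```

With this wrapper, `detector.detect(image)` returns records the visualization loop in step 4 can consume; note that the pipeline expects a PIL image, while step 3 reads with OpenCV, so a conversion such as `Image.fromarray` would be needed in between.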

## Contributing
We welcome contributions to the Object Detector project. If you would like to contribute, please follow these steps:

1. Fork the repository

2. Create a new branch for your feature or bug fix

3. Make your changes and commit them

4. Push your changes to your forked repository

5. Submit a pull request to the main repository

## License
This project is licensed under the MIT License.
82 changes: 82 additions & 0 deletions Generative Models/Object Detector/app.py
@@ -0,0 +1,82 @@
import gradio as gr
from PIL import Image, ImageDraw, ImageFont


# Use a pipeline as a high-level helper
from transformers import pipeline

# model_path = ("../Models/models--facebook--detr-resnet-50/snapshots"
# "/1d5f47bd3bdd2c4bbfa585418ffe6da5028b4c0b")

object_detector = pipeline("object-detection",
                           model="facebook/detr-resnet-50")

# object_detector = pipeline("object-detection",
#                            model=model_path)


def draw_bounding_boxes(image, detections, font_path=None, font_size=20):
    """
    Draws bounding boxes on the given image based on the detections.
    :param image: PIL.Image object
    :param detections: List of detection results, where each result is a
                       dictionary containing 'score', 'label', and 'box' keys.
                       'box' itself is a dictionary with 'xmin', 'ymin',
                       'xmax', 'ymax'.
    :param font_path: Path to the TrueType font file to use for text.
    :param font_size: Size of the font to use for text.
    :return: PIL.Image object with bounding boxes drawn.
    """
    # Make a copy of the image to draw on
    draw_image = image.copy()
    draw = ImageDraw.Draw(draw_image)

    # Load the custom font, or fall back to the default font
    if font_path:
        font = ImageFont.truetype(font_path, font_size)
    else:
        # The default font has a fixed size; for larger text, download a
        # TTF font file and pass its path via font_path.
        font = ImageFont.load_default()

    for detection in detections:
        box = detection['box']
        xmin = box['xmin']
        ymin = box['ymin']
        xmax = box['xmax']
        ymax = box['ymax']

        # Draw the bounding box
        draw.rectangle([(xmin, ymin), (xmax, ymax)], outline="red", width=3)

        # Also draw the label and score
        label = detection['label']
        score = detection['score']
        text = f"{label} {score:.2f}"

        # Draw the text on a filled background rectangle for visibility
        if font_path:  # Use the custom font with the requested size
            text_size = draw.textbbox((xmin, ymin), text, font=font)
        else:
            # Calculate the text size using the default font
            text_size = draw.textbbox((xmin, ymin), text)

        draw.rectangle([(text_size[0], text_size[1]),
                        (text_size[2], text_size[3])], fill="red")
        draw.text((xmin, ymin), text, fill="white", font=font)

    return draw_image


def detect_object(image):
    raw_image = image
    output = object_detector(raw_image)
    processed_image = draw_bounding_boxes(raw_image, output)
    return processed_image


demo = gr.Interface(fn=detect_object,
                    inputs=[gr.Image(label="Select Image", type="pil")],
                    outputs=[gr.Image(label="Processed Image", type="pil")],
                    title="@GenAILearniverse Project 6: Object Detector",
                    description="This application detects objects inside the "
                                "provided input image.")
demo.launch()

# print(output)
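As a quick, model-free sanity check of the drawing logic in `draw_bounding_boxes`, one can hand-build a detection dict in the same shape the DETR pipeline returns and draw it with Pillow alone. This is an illustrative sketch with invented values, not part of app.py:

```python
from PIL import Image, ImageDraw

# One fabricated detection in the pipeline's output format: a dict with
# 'score', 'label', and a 'box' dict of pixel coordinates.
detections = [
    {"score": 0.98, "label": "cat",
     "box": {"xmin": 10, "ymin": 20, "xmax": 110, "ymax": 140}},
]

image = Image.new("RGB", (200, 200), "white")
draw = ImageDraw.Draw(image)
for det in detections:
    box = det["box"]
    # Same red outline and label text that draw_bounding_boxes produces
    draw.rectangle([(box["xmin"], box["ymin"]), (box["xmax"], box["ymax"])],
                   outline="red", width=3)
    draw.text((box["xmin"], box["ymin"] - 12),
              f'{det["label"]} {det["score"]:.2f}', fill="red")

# The pixel on the box boundary is now red, the background stays white.
print(image.getpixel((10, 20)))  # -> (255, 0, 0)
```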
4 changes: 4 additions & 0 deletions Generative Models/Object Detector/requirements.txt
@@ -0,0 +1,4 @@
transformers
torch
gradio
timm
72 changes: 44 additions & 28 deletions Generative Models/Turn Images Into Story with AI/README.md
@@ -1,42 +1,58 @@
# Turn Images into Story with AI


## Project Overview

This project uses artificial intelligence to generate stories from images. The goal is to create a system that can take an image as input and produce a coherent and engaging story as output.

## How it Works

**Image Input**: The user uploads an image to the system.

**Image Analysis**: The system uses computer vision techniques to analyze the image and extract relevant features such as objects, scenes, and actions.

**Story Generation**: The system uses a natural language processing (NLP) model to generate a story based on the extracted features.

**Post-processing**: The system refines the generated story to ensure coherence, grammar, and readability.
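The four stages above can be sketched as plain functions. This is an illustrative-only outline: `extract_features`, `generate_story`, and `post_process` are stand-ins for the real computer vision and NLP models, and the feature tags and template are invented.

```python
# Hypothetical sketch of the image-to-story flow; not the project's models.

def extract_features(image_path):
    # Stage 2: computer vision would analyze the image here;
    # placeholder tags stand in for detected objects/scenes/actions.
    return ["dog", "park", "running"]

def generate_story(features, genre="whimsical"):
    # Stage 3: an NLP model would generate text here; a template stands in.
    return f"A {genre} tale: a {features[0]} in the {features[1]}, {features[2]}."

def post_process(story):
    # Stage 4: light cleanup for capitalization and terminal punctuation.
    story = story[0].upper() + story[1:]
    return story if story.endswith(".") else story + "."

story = post_process(generate_story(extract_features("photo.jpg")))
print(story)  # -> A whimsical tale: a dog in the park, running.
```

The real system would replace the first two stand-ins with an image captioning or object recognition model and a text generation model, keeping the same stage boundaries.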

## Features

**Image Upload**: Users can upload images in various formats (e.g., JPEG, PNG, GIF).

**Story Generation**: The system generates a story based on the uploaded image.

**Story Customization**: Users can customize the story by selecting from various genres, tones, and styles.
**Image-Story Pairing**: The system allows users to view the original image alongside the generated story.

## Technical Requirements

**Programming Languages**: Python 3.x, JavaScript (for frontend)

**Frameworks**: TensorFlow, Keras (for image analysis and NLP), React (for frontend)

**Libraries**: OpenCV, Pillow (for image processing), NLTK, spaCy (for NLP)

**Hardware**: GPU acceleration recommended for faster image processing and story generation

## Installation

**Clone the repository**: `git clone https://github.com/NANDAGOPALNG/ML-Nexus.git`, then change into `Generative Models/Turn Images Into Story with AI`

**Install dependencies**: `pip install -r requirements.txt`

**Set up the environment**: `python setup.py`

**Run the application**: `python app.py`

## Usage

1. Upload an image to the system.
2. Select the desired story genre, tone, and style (optional).
3. Click the "Generate Story" button.
4. View the generated story alongside the original image.

## Contributing

Contributions are welcome! If you'd like to contribute to this project, please:

@@ -45,14 +61,14 @@
Create a new branch for your feature or bug fix.
Commit your changes with a descriptive message.
Open a pull request.

## License
This project is licensed under the MIT License. See the LICENSE file for details.

## Acknowledgments

We acknowledge the contributions of the open-source community and the developers of the libraries and frameworks used in this project.

## Future Work

**Improving Story Quality**: Refine the NLP model to generate more coherent and engaging stories.

**Expanding Image Support**: Add support for more image formats and sizes.
