Added a new project called "Object Detector" #741

Merged
merged 6 commits into from
Nov 3, 2024
92 changes: 92 additions & 0 deletions Generative Models/Object Detector/README.md
@@ -0,0 +1,92 @@
# Object Detector
## Description
The Object Detector is a computer vision project that uses deep learning algorithms to detect and identify objects in images and videos. This project can be used for a variety of applications, such as security monitoring, autonomous vehicles, and smart home systems.

## Features

1. Supports detection of multiple object classes in a single image or video frame

2. Provides bounding boxes and class labels for each detected object

3. Utilizes a pre-trained deep learning model for fast and accurate object detection

4. Allows for custom training of the object detection model on new datasets

5. Provides an easy-to-use Python API for integrating the object detector into your own projects
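
As a rough illustration of the first two features, the sketch below keeps only confident detections and groups their boxes by class label. It assumes each detection is a dictionary with `score`, `label`, and `box` keys, which is the format described in the docstring of this project's `app.py`. The function name `filter_detections`, the 0.5 threshold, and the sample values are made up for the example and are not real model output.

```python
from collections import defaultdict


def filter_detections(detections, min_score=0.5):
    """Keep detections scoring at least min_score, grouped by class label."""
    grouped = defaultdict(list)
    for det in detections:
        if det["score"] >= min_score:
            grouped[det["label"]].append(det["box"])
    return dict(grouped)


# Hypothetical sample data in the assumed format.
sample = [
    {"score": 0.98, "label": "cat",
     "box": {"xmin": 10, "ymin": 20, "xmax": 110, "ymax": 220}},
    {"score": 0.91, "label": "dog",
     "box": {"xmin": 150, "ymin": 40, "xmax": 300, "ymax": 260}},
    {"score": 0.30, "label": "cat",
     "box": {"xmin": 5, "ymin": 5, "xmax": 25, "ymax": 25}},
]

boxes_by_label = filter_detections(sample)
for label, boxes in boxes_by_label.items():
    print(label, len(boxes))
```

Raising `min_score` trades missed objects for fewer false positives; a suitable value depends on the model and the application.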

## Getting Started

**Prerequisites**

Python 3.6 or higher

TensorFlow 2.x or PyTorch 1.x

OpenCV

## Installation

1. Clone the repository and change into the project directory:


git clone https://github.com/NANDAGOPALNG/ML-Nexus.git
cd "ML-Nexus/Generative Models/Object Detector"

2. Install the required dependencies:


pip install -r requirements.txt
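
After installing, a quick way to confirm that the dependencies are importable is a small check like the one below. It is only a sketch: the package list is copied from this project's `requirements.txt`, and `missing_packages` is a hypothetical helper name.

```python
import importlib.util

# Top-level package names taken from requirements.txt.
REQUIRED = ["transformers", "torch", "gradio", "timm"]


def missing_packages(names):
    """Return the names that cannot be found in the current environment."""
    return [name for name in names if importlib.util.find_spec(name) is None]


missing = missing_packages(REQUIRED)
if missing:
    print("Missing packages:", ", ".join(missing))
else:
    print("All required packages are installed.")
```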

## Usage

1. Import OpenCV and the object detector module:


import cv2
from object_detector import ObjectDetector

2. Create an instance of the object detector:


detector = ObjectDetector()

3. Detect objects in an image:


image = cv2.imread('image.jpg')
detections = detector.detect(image)

4. Visualize the detected objects:


for detection in detections:
    x, y, w, h = detection['bbox']
    label = detection['label']
    cv2.rectangle(image, (x, y), (x+w, y+h), (0, 255, 0), 2)
    cv2.putText(image, label, (x, y-10), cv2.FONT_HERSHEY_SIMPLEX, 0.5, (36,255,12), 2)

cv2.imshow('Object Detection', image)
cv2.waitKey(0)
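
The loop above expects each detection to carry a `bbox` of `(x, y, w, h)` and a `label`, whereas the `object-detection` pipeline used in this project's `app.py` is documented there as returning a `box` dictionary with `xmin`, `ymin`, `xmax`, and `ymax`. A minimal sketch of converting one format to the other, assuming that output format; `to_xywh` and `adapt` are hypothetical helper names and the sample values are invented:

```python
def to_xywh(box):
    """Convert corner coordinates to (x, y, width, height)."""
    return (
        box["xmin"],
        box["ymin"],
        box["xmax"] - box["xmin"],
        box["ymax"] - box["ymin"],
    )


def adapt(detections):
    """Map pipeline-style detections to the bbox/label format used above."""
    return [
        {"bbox": to_xywh(d["box"]), "label": d["label"], "score": d["score"]}
        for d in detections
    ]


# Hypothetical detection in the assumed pipeline format.
sample = [
    {"score": 0.98, "label": "cat",
     "box": {"xmin": 10, "ymin": 20, "xmax": 110, "ymax": 220}},
]
adapted = adapt(sample)
print(adapted[0]["bbox"])
```

With a conversion like this, the same drawing loop can be reused regardless of which detector produced the results.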

## Contributing
We welcome contributions to the Object Detector project. If you would like to contribute, please follow these steps:

1. Fork the repository

2. Create a new branch for your feature or bug fix

3. Make your changes and commit them

4. Push your changes to your forked repository

5. Submit a pull request to the main repository

## License
This project is licensed under the MIT License.
82 changes: 82 additions & 0 deletions Generative Models/Object Detector/app.py
@@ -0,0 +1,82 @@
import gradio as gr
from PIL import Image, ImageDraw, ImageFont


# Use a pipeline as a high-level helper
from transformers import pipeline

# model_path = ("../Models/models--facebook--detr-resnet-50/snapshots"
# "/1d5f47bd3bdd2c4bbfa585418ffe6da5028b4c0b")

object_detector = pipeline("object-detection",
                           model="facebook/detr-resnet-50")

# object_detector = pipeline("object-detection",
# model=model_path)


def draw_bounding_boxes(image, detections, font_path=None, font_size=20):
    """
    Draws bounding boxes on the given image based on the detections.
    :param image: PIL.Image object
    :param detections: List of detection results, where each result is a dictionary containing
                       'score', 'label', and 'box' keys. 'box' itself is a dictionary with 'xmin',
                       'ymin', 'xmax', 'ymax'.
    :param font_path: Path to the TrueType font file to use for text.
    :param font_size: Size of the font to use for text.
    :return: PIL.Image object with bounding boxes drawn.
    """
    # Make a copy of the image to draw on
    draw_image = image.copy()
    draw = ImageDraw.Draw(draw_image)

    # Load the custom font if a path is provided, otherwise the default font
    if font_path:
        font = ImageFont.truetype(font_path, font_size)
    else:
        # Without font_path the default font is used, and its size is fixed.
        # For larger text, download a TTF font file and pass its path as font_path.
        font = ImageFont.load_default()

    for detection in detections:
        box = detection['box']
        xmin = box['xmin']
        ymin = box['ymin']
        xmax = box['xmax']
        ymax = box['ymax']

        # Draw the bounding box
        draw.rectangle([(xmin, ymin), (xmax, ymax)], outline="red", width=3)

        # Build the caption from the label and score
        label = detection['label']
        score = detection['score']
        text = f"{label} {score:.2f}"

        # Measure the text so a background rectangle can be drawn for visibility
        if font_path:  # Use the custom font with increased size
            text_size = draw.textbbox((xmin, ymin), text, font=font)
        else:
            # Calculate text size using the default font
            text_size = draw.textbbox((xmin, ymin), text)

        draw.rectangle([(text_size[0], text_size[1]), (text_size[2], text_size[3])], fill="red")
        draw.text((xmin, ymin), text, fill="white", font=font)

    return draw_image


def detect_object(image):
    raw_image = image
    output = object_detector(raw_image)
    processed_image = draw_bounding_boxes(raw_image, output)
    return processed_image

demo = gr.Interface(fn=detect_object,
                    inputs=[gr.Image(label="Select Image", type="pil")],
                    outputs=[gr.Image(label="Processed Image", type="pil")],
                    title="@GenAILearniverse Project 6: Object Detector",
                    description="THIS APPLICATION WILL BE USED TO DETECT OBJECTS INSIDE THE PROVIDED INPUT IMAGE.")
demo.launch()

# print(output)
4 changes: 4 additions & 0 deletions Generative Models/Object Detector/requirements.txt
@@ -0,0 +1,4 @@
transformers
torch
gradio
timm
72 changes: 44 additions & 28 deletions Generative Models/Turn Images Into Story with AI/README.md
@@ -1,42 +1,58 @@
**Turn Images into Story with AI**
# Turn Images into Story with AI


*Project Overview*
## Project Overview

This project uses artificial intelligence to generate stories from images. The goal is to create a system that can take an image as input and produce a coherent and engaging story as output.

*How it Works*
## How it Works

Image Input: The user uploads an image to the system.
Image Analysis: The system uses computer vision techniques to analyze the image and extract relevant features such as objects, scenes, and actions.
Story Generation: The system uses a natural language processing (NLP) model to generate a story based on the extracted features.
Post-processing: The system refines the generated story to ensure coherence, grammar, and readability.
Features
Image Upload: Users can upload images in various formats (e.g., JPEG, PNG, GIF).
Story Generation: The system generates a story based on the uploaded image.
Story Customization: Users can customize the story by selecting from various genres, tones, and styles.
Image-Story Pairing: The system allows users to view the original image alongside the generated story.
**Image Input**: The user uploads an image to the system.

*Technical Requirements*
**Image Analysis**: The system uses computer vision techniques to analyze the image and extract relevant features such as objects, scenes, and actions.

Programming Languages: Python 3.x, JavaScript (for frontend)
Frameworks: TensorFlow, Keras (for image analysis and NLP), React (for frontend)
Libraries: OpenCV, Pillow (for image processing), NLTK, spaCy (for NLP)
Hardware: GPU acceleration recommended for faster image processing and story generation
Installation
Clone the repository: git clone https://github.com/NANDAGOPALNG/ML-Nexus/tree/main/Generative%20Models/Turn%20Images%20Into%20Story%20with%20AI
Install dependencies: pip install -r requirements.txt
Set up the environment: python setup.py
Run the application: python app.py
**Story Generation**: The system uses a natural language processing (NLP) model to generate a story based on the extracted features.

*Usage*
**Post-processing**: The system refines the generated story to ensure coherence, grammar, and readability.
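
To make the hand-off between the image analysis and story generation steps more concrete, here is a minimal sketch of turning extracted features into a text prompt for a language model. Everything in it is an assumption for illustration: the README does not specify how features are passed to the model, and `build_story_prompt`, its parameters, and the sample inputs are hypothetical.

```python
def build_story_prompt(objects, scene=None, genre="adventure"):
    """Compose a story prompt from features extracted from an image."""
    parts = [f"Write a short {genre} story"]
    if scene:
        parts.append(f"set in {scene}")
    if objects:
        parts.append("featuring " + ", ".join(objects))
    return " ".join(parts) + "."


# Hypothetical features that an image analysis step might produce.
prompt = build_story_prompt(["a dog", "a red ball"], scene="a park")
print(prompt)
```

The resulting string would then be handed to whichever text generation model the project uses, and the post-processing step would operate on that model's output.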

**Features**

**Image Upload**: Users can upload images in various formats (e.g., JPEG, PNG, GIF).

**Story Generation**: The system generates a story based on the uploaded image.

**Story Customization**: Users can customize the story by selecting from various genres, tones, and styles.
**Image-Story Pairing**: The system allows users to view the original image alongside the generated story.

## Technical Requirements

**Programming Languages**: Python 3.x, JavaScript (for frontend)

**Frameworks**: TensorFlow, Keras (for image analysis and NLP), React (for frontend)

**Libraries**: OpenCV, Pillow (for image processing), NLTK, spaCy (for NLP)

**Hardware**: GPU acceleration recommended for faster image processing and story generation

**Installation**

**Clone the repository**: git clone https://github.com/NANDAGOPALNG/ML-Nexus/tree/main/Generative%20Models/Turn%20Images%20Into%20Story%20with%20AI

**Install dependencies**: pip install -r requirements.txt

**Set up the environment**: python setup.py

**Run the application**: python app.py

## Usage

Upload an image to the system.
Select the desired story genre, tone, and style (optional).
Click the "Generate Story" button.
View the generated story alongside the original image.

*Contributing*
## Contributing

Contributions are welcome! If you'd like to contribute to this project, please:

@@ -45,14 +61,14 @@ Create a new branch for your feature or bug fix.
Commit your changes with a descriptive message.
Open a pull request.

*License*
## License
This project is licensed under the MIT License. See the LICENSE file for details.

*Acknowledgments*
## Acknowledgments

This project was inspired by the work of [insert inspiration or reference]. We acknowledge the contributions of the open-source community and the developers of the libraries and frameworks used in this project.
We acknowledge the contributions of the open-source community and the developers of the libraries and frameworks used in this project.

*Future Work*
## Future Work

Improving Story Quality: Refine the NLP model to generate more coherent and engaging stories.
Expanding Image Support: Add support for more image formats and sizes.