THE GUIDE

The "Tools" You Need

import cv2:

This line brings in a powerful library called OpenCV, which is like a toolbox for working with pictures and videos.

import mediapipe as mp:

This line gets another toolbox called MediaPipe, which is great at understanding parts of the human body, like hands and faces.

from ultralytics import YOLO:

This line gets the main "detective" tool, YOLO, which is really good at finding many different objects at once.

Setting Up the "Detectives"

yolo_model = YOLO("yolov8n.pt"):

This line creates the YOLO detective. The file "yolov8n.pt" holds the pretrained model weights, so loading it is like giving the detective a manual that tells it what things to look for.

mp_hands = mp.solutions.hands:

This line gets the hand-detecting part of the MediaPipe toolbox.

mp_face = mp.solutions.face_mesh:

This line gets the face-detecting part of the MediaPipe toolbox, called Face Mesh.

hands = mp_hands.Hands(...):

This line turns on the hand detective. The numbers inside the parentheses tell it how confident it needs to be to say, "Hey, I found a hand!"

face_mesh = mp_face.FaceMesh(...):

This line turns on the face detective, with its own confidence level.

mp_draw = mp.solutions.drawing_utils:

This line gets a special drawing tool that helps draw the lines and dots on the hands and faces that the other detectives find.

The "Watch List"

common_objects = ['book', ... 'clock']:

This is a list of everyday things the code should pay special attention to, like a shopping list for the detective.

color_map = {...}:

This is a small color dictionary. It tells the code what color to use for certain important things it finds, like a yellow box for a person and a blue box for a hand.

Starting the "Investigation"

cap = cv2.VideoCapture(0):

This line turns on your computer's camera. The 0 means it will use the default camera.

while True::

This line starts an endless loop. It tells the code to keep doing the same thing over and over again until something tells it to stop.

In the Endless Loop

ret, frame = cap.read():

This line is like taking a single photo from your camera. The photo is called a frame. ret is a special signal that tells you if the photo was taken successfully.

if not ret: break:

If the camera fails to take a photo (e.g., it's unplugged), this line tells the code to stop the loop and end the program.
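
The read, check, and break pattern above can be tried without a real camera. The sketch below is only an illustration: FakeCapture and count_frames are hypothetical names that are not in the original code, and FakeCapture simply imitates the (ret, frame) pair that cap.read() hands back.

```python
class FakeCapture:
    """Hypothetical stand-in for a camera object; it needs no hardware."""

    def __init__(self, frames):
        self._frames = list(frames)

    def read(self):
        # Imitate the (ret, frame) pair: ret is False once nothing is left.
        if self._frames:
            return True, self._frames.pop(0)
        return False, None


def count_frames(cap):
    """Run the read-check-break loop and return how many frames were seen."""
    processed = 0
    while True:
        ret, frame = cap.read()
        if not ret:
            break  # no photo came back, so leave the loop
        processed += 1  # a real program would analyse `frame` here
    return processed


print(count_frames(FakeCapture(["frame-1", "frame-2", "frame-3"])))  # prints 3
```

When the fake camera runs out of frames, ret becomes False and the loop ends, which is the same thing that happens when a real camera stops responding.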

frame = cv2.resize(...):

This line makes the photo smaller so the computer can work with it faster.

rgb = cv2.cvtColor(...):

OpenCV stores colors in blue-green-red (BGR) order, while MediaPipe expects red-green-blue (RGB). This line changes the photo's color order to a format the hand and face detectives can understand.
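
To see what a color-order change does, here is a small sketch that works on a single pixel. It assumes the elided cv2.cvtColor(...) call converts from OpenCV's blue-green-red order to red-green-blue (the variable name rgb suggests this, but the original arguments are not shown). The helper bgr_to_rgb is a hypothetical name used only for illustration.

```python
def bgr_to_rgb(pixel):
    """Reverse the channel order of one (blue, green, red) pixel."""
    blue, green, red = pixel
    return (red, green, blue)


pure_red_bgr = (0, 0, 255)       # pure red written in blue-green-red order
print(bgr_to_rgb(pure_red_bgr))  # prints (255, 0, 0)
```

For a whole frame, OpenCV offers the cv2.COLOR_BGR2RGB flag for this kind of swap. Whether the original code uses that exact flag is an assumption.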

The Detective at Work

results = yolo_model(frame, ...)[0]:

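This is where the YOLO detective looks at the photo and tries to find every object it was trained to recognize, not only the ones on the "watch list." The watch list is used later, when the code picks a color for each box.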

for box in results.boxes::

This line starts a new loop. It tells the code to look at each thing the YOLO detective found, one by one.

x1, y1, x2, y2 = map(int, box.xyxy[0].tolist()):

This line gets the numbers for the corners of the box that goes around the object. Think of it as getting the exact coordinates for where to draw the box.
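
The corner unpacking can be tried on plain numbers. In this sketch, raw_corners is a hypothetical list standing in for what box.xyxy[0].tolist() returns: the left, top, right, and bottom edges of the box as decimal numbers.

```python
# Hypothetical values standing in for box.xyxy[0].tolist():
# [left, top, right, bottom] as floats.
raw_corners = [34.7, 80.2, 210.9, 305.5]

x1, y1, x2, y2 = map(int, raw_corners)  # int() drops the fractional part

print((x1, y1), (x2, y2))  # top-left and bottom-right: prints (34, 80) (210, 305)
print(x2 - x1, y2 - y1)    # box width and height in pixels: prints 176 225
```

Whole numbers are needed because drawing functions work in pixel positions, and a pixel cannot be split.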

cls = ..., conf = ..., class_name = ...:

These lines figure out what the object is (its class), how sure the detective is (confidence), and what to call it (the name).
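
The original expressions are elided here, so the following is only one plausible way to think about the step: the detective reports a class number, and a lookup table turns that number into a name. The table names and the helper describe below are hypothetical and cover just three entries for illustration.

```python
# Hypothetical slice of a class-name table; a real model has many more entries.
names = {0: "person", 63: "laptop", 73: "book"}


def describe(cls_index, confidence):
    """Return (class_name, confidence) for one detection."""
    class_name = names.get(cls_index, "unknown")
    return class_name, float(confidence)


print(describe(63, 0.953))  # prints ('laptop', 0.953)
```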

if class_name in color_map: ...:

This part of the code is like a set of rules. It says, "If the thing you found is on the special color list, use that color. If it's on the list of common objects, use the 'default' color. If it's something else, use a gray color."
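
The three rules can be sketched as a small function. Everything concrete below is an assumption made for illustration: the shortened watch list, the color values, the "default" key, the gray value, and the helper name pick_color are not taken from the original code. Note that OpenCV writes colors in blue-green-red order, so yellow is (0, 255, 255).

```python
# All values below are hypothetical; OpenCV colors are (blue, green, red).
common_objects = ["book", "clock"]
color_map = {
    "person": (0, 255, 255),  # yellow
    "hand": (255, 0, 0),      # blue
    "default": (0, 255, 0),   # green, assumed color for watch-list objects
}
GRAY = (128, 128, 128)


def pick_color(class_name):
    """Apply the three rules in order: special color, default color, gray."""
    if class_name in color_map:
        return color_map[class_name]
    if class_name in common_objects:
        return color_map["default"]
    return GRAY


print(pick_color("person"))   # prints (0, 255, 255)
print(pick_color("book"))     # prints (0, 255, 0)
print(pick_color("giraffe"))  # prints (128, 128, 128)
```

The order of the checks matters: an object that appears in both lists gets its special color, because that rule is tested first.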

label = f"{class_name} {conf:.2f}":

This line creates the text for the label, like "laptop 0.95," which means it's 95% sure it's a laptop.
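
The label line can be run on its own. The detection values below are made up for illustration; the format string is the one shown above.

```python
class_name, conf = "laptop", 0.9537  # hypothetical detection

label = f"{class_name} {conf:.2f}"  # .2f rounds to two decimal places
print(label)                        # prints laptop 0.95
print(f"{conf:.0%}")                # same value as a percentage: prints 95%
```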

cv2.rectangle(...) and cv2.putText(...):

These two lines use the numbers and colors from before to actually draw the box and the label on the photo.

The Other Detectives

hand_results = hands.process(rgb):

This line sends the photo to the hand detective to see if it can find any hands.

if hand_results.multi_hand_landmarks::

If the hand detective finds a hand, this part of the code runs.

for handLms in ...:

This line looks at each hand it found.

mp_draw.draw_landmarks(...):

This line uses the drawing tool to draw all the little dots and lines on the hand.

The same process is repeated for the face detective: face_results = face_mesh.process(rgb), which looks for faces and draws the lines on them.
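
The "if" check before the hand loop matters because MediaPipe sets multi_hand_landmarks to None when no hand is in view, and looping over None would crash the program. The sketch below imitates the result object with SimpleNamespace from Python's standard library. The helper count_hands and the two sample results are hypothetical.

```python
from types import SimpleNamespace


def count_hands(hand_results):
    """Count detected hands, treating a missing list as zero."""
    found = 0
    if hand_results.multi_hand_landmarks:
        for handLms in hand_results.multi_hand_landmarks:
            found += 1  # a real program would draw this hand here
    return found


no_hands = SimpleNamespace(multi_hand_landmarks=None)
two_hands = SimpleNamespace(multi_hand_landmarks=["left", "right"])
print(count_hands(no_hands), count_hands(two_hands))  # prints 0 2
```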

Showing the "Evidence"

cv2.imshow("...", frame):

This line shows you the photo with all the boxes and lines drawn on it in a window on your screen. The window's name is "Advanced Object Tracker."

if cv2.waitKey(1) & 0xFF == ord('q')::

This is the exit switch. cv2.waitKey(1) waits about one millisecond for a key press, and the rest of the line checks whether the key you pressed was 'q'.

break:

If you press 'q', this line tells the code to stop the endless loop and go to the final part.
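
The "& 0xFF" part of the exit check keeps only the lowest 8 bits of the key code, so any extra bits a system adds are ignored. The sketch below feeds made-up key codes into the same comparison. The helper name is_quit_key is hypothetical.

```python
def is_quit_key(key_code):
    """True when the low 8 bits of key_code match the letter q."""
    return key_code & 0xFF == ord("q")


print(ord("q"))               # prints 113
print(is_quit_key(113))       # prints True
print(is_quit_key(0x100071))  # extra high bits are masked away: prints True
print(is_quit_key(-1))        # -1 means no key was pressed: prints False
```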

The Cleanup

cap.release():

This line tells your computer to let go of the camera, which turns it off and frees it for other programs.

cv2.destroyAllWindows():

This line closes the window that was showing you the photos. The program is now finished.
