Outdated code changes #169

Open · wants to merge 23 commits into master
161 changes: 27 additions & 134 deletions README.md
@@ -1,143 +1,36 @@
# yolov4-deepsort
[![license](https://img.shields.io/github/license/mashape/apistatus.svg)](LICENSE)
[![Open In Colab](https://colab.research.google.com/assets/colab-badge.svg)](https://colab.research.google.com/drive/1zmeSTP3J5zu2d5fHgsQC06DyYEYJFXq1?usp=sharing)
# Speed Detection Project

Object tracking implemented with YOLOv4, DeepSort, and TensorFlow. YOLOv4 is a state-of-the-art algorithm that uses deep convolutional neural networks to perform object detections. We can take the output of YOLOv4 and feed these object detections into Deep SORT (Simple Online and Realtime Tracking with a Deep Association Metric) to create a highly accurate object tracker.
## Overview
This project aims to detect and track objects in a video stream and then estimate their speed based on their movement. The project is divided into three main phases: object detection using YOLO v4, object tracking using DeepSORT, and speed calculation based on object movement between two horizontal lines in the video.

## Demo of Object Tracker on Persons
<p align="center"><img src="data/helpers/demo.gif"\></p>
## Object Detection
In this phase, YOLO v4, a state-of-the-art object detection algorithm, is used to identify objects of interest in each frame of the video. YOLO v4 provides bounding boxes around detected objects along with their class labels.

## Demo of Object Tracker on Cars
<p align="center"><img src="data/helpers/cars.gif"\></p>
## Object Tracking
The object tracking phase employs the DeepSORT (Deep Simple Online and Realtime Tracking) algorithm to track the detected objects across consecutive frames. DeepSORT enables accurate, real-time tracking of objects, providing their trajectories over time.

## Getting Started
To get started, install the proper dependencies either via Anaconda or Pip. I recommend the Anaconda route for people using a GPU, as it configures the CUDA toolkit version for you.
## Speed Calculation
Speed is calculated by measuring the time taken for an object to traverse between two horizontal lines marked in the video. The distance between these lines is known. By analyzing the displacement of objects between frames and the time elapsed, we can calculate their speed using the formula: speed = distance / time.
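For concreteness, here is a minimal sketch of that calculation (assuming the distance between the lines is known in metres and the crossing times come from video frame indices; the function name is illustrative, not part of this repository):

```python
def estimate_speed_kmh(distance_m, enter_frame, exit_frame, fps):
    """Speed of an object that crossed both lines, in km/h."""
    elapsed_s = (exit_frame - enter_frame) / fps  # time between the two crossings
    speed_ms = distance_m / elapsed_s             # speed = distance / time, in m/s
    return speed_ms * 3.6                         # convert m/s to km/h

# Example: 20 m between the lines, crossed in 48 frames at 24 fps (2 s) -> 36.0 km/h
print(estimate_speed_kmh(20, 100, 148, 24))
```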

### Conda (Recommended)
## Usage
1. **Data Preparation**: Ensure that your video data is accessible and suitable for processing.
2. **Object Detection**: Run YOLO v4 to detect objects in each frame.
3. **Object Tracking**: Apply DeepSORT algorithm to track the detected objects across frames.
4. **Speed Calculation**: Identify two horizontal lines in the video between which the distance is known. Measure the time taken for the object to cover this distance (see the sketch after this list).
5. **Speed Estimation**: Utilize the measured time and known distance to estimate the speed of the tracked objects.
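As a rough sketch of the line-crossing bookkeeping behind steps 4 and 5 (names are illustrative, not this repository's code; it assumes objects move down the frame, so the y-coordinate of the box bottom increases over time):

```python
# Per-track state: 0 = has not reached line 1, 1 = passed line 1, 2 = passed line 2
state, enter_frame, exit_frame = {}, {}, {}

def update_crossing(track_id, y_bottom, prev_y, frame_num, line1_y, line2_y):
    """Record the frames at which a track's box bottom passes each line."""
    s = state.get(track_id, 0)
    if s == 0 and prev_y <= line1_y <= y_bottom:
        state[track_id] = 1
        enter_frame[track_id] = frame_num
    elif s == 1 and prev_y <= line2_y <= y_bottom:
        state[track_id] = 2
        exit_frame[track_id] = frame_num

# Example: box bottom moves from y=420 to y=450, crossing line 1 (y=433) on frame 57
update_crossing(track_id=3, y_bottom=450, prev_y=420, frame_num=57, line1_y=433, line2_y=604)
print(state, enter_frame)  # {3: 1} {3: 57}
```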

```bash
# Tensorflow CPU
conda env create -f conda-cpu.yml
conda activate yolov4-cpu
## Requirements
- Python 3.x
- OpenCV
- YOLO v4 model
- DeepSORT algorithm
- Pre-trained weights for YOLO v4 and DeepSORT (or train your own models)
- Basic understanding of computer vision and deep learning concepts

# Tensorflow GPU
conda env create -f conda-gpu.yml
conda activate yolov4-gpu
```
## Credits
- YOLO v4: Credits to the developers and contributors of YOLO v4 for their outstanding object detection model.
- DeepSORT: Credits to the developers and contributors of DeepSORT for providing a robust object tracking algorithm.

### Pip
(TensorFlow 2 packages require a pip version >19.0.)
```bash
# TensorFlow CPU
pip install -r requirements.txt
## Disclaimer
This project is for educational and research purposes only. Usage of the project for any unlawful purposes is strictly prohibited. The developers of this project are not responsible for any misuse or illegal activities conducted using this software.

# TensorFlow GPU
pip install -r requirements-gpu.txt
```
### Nvidia Driver (For GPU, if you are not using a Conda environment and haven't set up CUDA yet)
Make sure to use CUDA Toolkit version 10.1 as it is the proper version for the TensorFlow version used in this repository.
https://developer.nvidia.com/cuda-10.1-download-archive-update2

## Downloading Official YOLOv4 Pre-trained Weights
Our object tracker uses YOLOv4 to make the object detections, which Deep SORT then uses to track. There is an official pre-trained YOLOv4 object detector model that is able to detect 80 classes. For easy demo purposes we will use the pre-trained weights for our tracker.
Download pre-trained yolov4.weights file: https://drive.google.com/open?id=1cewMfusmPjYWbrnuJRuKhPMwRe_b9PaT

Copy and paste yolov4.weights from your downloads folder into the 'data' folder of this repository.

If you want to use yolov4-tiny.weights, a smaller model that is faster at running detections but less accurate, download the file here: https://github.com/AlexeyAB/darknet/releases/download/darknet_yolo_v4_pre/yolov4-tiny.weights

## Running the Tracker with YOLOv4
To implement object tracking with YOLOv4, we first convert the .weights file into the corresponding TensorFlow model, which is saved to a checkpoints folder. Then all we need to do is run the object_tracker.py script to run our object tracker with YOLOv4, Deep SORT and TensorFlow.
```bash
# Convert darknet weights to tensorflow model
python save_model.py --model yolov4

# Run yolov4 deep sort object tracker on video
python object_tracker.py --video ./data/video/test.mp4 --output ./outputs/demo.avi --model yolov4

# Run yolov4 deep sort object tracker on webcam (set video flag to 0)
python object_tracker.py --video 0 --output ./outputs/webcam.avi --model yolov4
```
The output flag allows you to save the resulting video of the object tracker running so that you can view it again later. The video will be saved to the path that you set (the 'outputs' folder if you run the commands above).

If you want to run yolov3, set the model flag to ``--model yolov3``, add the yolov3.weights file to the 'data' folder, and adjust the weights flag in the above commands. (See all the available command line flags and their descriptions in the section below.)

## Running the Tracker with YOLOv4-Tiny
The following commands will allow you to run the yolov4-tiny model. Yolov4-tiny gives you a higher speed (FPS) for the tracker at a slight cost to accuracy. Make sure that you have downloaded the tiny weights file and added it to the 'data' folder for these commands to work!
```
# save yolov4-tiny model
python save_model.py --weights ./data/yolov4-tiny.weights --output ./checkpoints/yolov4-tiny-416 --model yolov4 --tiny

# Run yolov4-tiny object tracker
python object_tracker.py --weights ./checkpoints/yolov4-tiny-416 --model yolov4 --video ./data/video/test.mp4 --output ./outputs/tiny.avi --tiny
```

## Resulting Video
As mentioned above, the resulting video will save to wherever you set the ``--output`` command line flag path. I always set it to save to the 'outputs' folder. You can also change the type of video saved by adjusting the ``--output_format`` flag; by default it is set to the XVID codec, which produces AVI files.

Example video showing tracking of all COCO dataset classes:
<p align="center"><img src="data/helpers/all_classes.gif"\></p>

## Filter Classes that are Tracked by Object Tracker
By default the code is set up to track all 80 or so classes from the COCO dataset, which is what the pre-trained YOLOv4 model is trained on. However, you can easily adjust a few lines of code in order to track any one class or any combination of the 80 classes. It is super easy to filter only the ``person`` class or only the ``car`` class, which are the most common.

To filter a custom selection of classes, all you need to do is comment out line 159 and uncomment line 162 of [object_tracker.py](https://github.com/theAIGuysCode/yolov4-deepsort/blob/master/object_tracker.py). Within the list ``allowed_classes``, just add whichever classes you want the tracker to track. The classes can be any of the 80 that the model is trained on; see which classes you can track in the file [data/classes/coco.names](https://github.com/theAIGuysCode/yolov4-deepsort/blob/master/data/classes/coco.names).

This example would allow the classes for person and car to be tracked.
<p align="center"><img src="data/helpers/filter_classes.PNG"\></p>
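In code, the customization amounts to something like this (a sketch; the exact line numbers may shift between versions of the script):

```python
# allowed_classes = list(class_names.values())  # default: track all 80 COCO classes
allowed_classes = ['person', 'car']             # track only the classes listed here
```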

### Demo of Object Tracker set to only track the class 'person'
<p align="center"><img src="data/helpers/demo.gif"\></p>

### Demo of Object Tracker set to only track the class 'car'
<p align="center"><img src="data/helpers/cars.gif"\></p>

## Command Line Args Reference

```bash
save_model.py:
--weights: path to weights file
(default: './data/yolov4.weights')
--output: path to output
(default: './checkpoints/yolov4-416')
--[no]tiny: yolov4 or yolov4-tiny
(default: 'False')
--input_size: define input size of export model
(default: 416)
--framework: what framework to use (tf, trt, tflite)
(default: tf)
--model: yolov3 or yolov4
(default: yolov4)

object_tracker.py:
--video: path to input video (use 0 for webcam)
(default: './data/video/test.mp4')
--output: path to output video (remember to set right codec for given format. e.g. XVID for .avi)
(default: None)
--output_format: codec used in VideoWriter when saving video to file
(default: 'XVID')
--[no]tiny: yolov4 or yolov4-tiny
(default: 'false')
--weights: path to weights file
(default: './checkpoints/yolov4-416')
--framework: what framework to use (tf, trt, tflite)
(default: tf)
--model: yolov3 or yolov4
(default: yolov4)
--size: resize images to
(default: 416)
--iou: iou threshold
(default: 0.45)
--score: confidence threshold
(default: 0.50)
--dont_show: don't show video output
(default: False)
--info: print detailed info about tracked objects
(default: False)
```

### References

Huge shoutout goes to hunglc007 and nwojke for creating the backbones of this repository:
* [tensorflow-yolov4-tflite](https://github.com/hunglc007/tensorflow-yolov4-tflite)
* [Deep SORT Repository](https://github.com/nwojke/deep_sort)
Binary file added data/video/highway_mini.mp4
Binary file not shown.
2 changes: 1 addition & 1 deletion deep_sort/detection.py
@@ -29,7 +29,7 @@ class Detection(object):
"""

    def __init__(self, tlwh, confidence, class_name, feature):
        self.tlwh = np.asarray(tlwh, dtype=np.float)
        self.tlwh = np.asarray(tlwh, dtype=float)
        self.confidence = float(confidence)
        self.class_name = class_name
        self.feature = np.asarray(feature, dtype=np.float32)
2 changes: 1 addition & 1 deletion deep_sort/preprocessing.py
@@ -38,7 +38,7 @@ def non_max_suppression(boxes, classes, max_bbox_overlap, scores=None):
    if len(boxes) == 0:
        return []

    boxes = boxes.astype(np.float)
    boxes = boxes.astype(float)
    pick = []

    x1 = boxes[:, 0]
68 changes: 67 additions & 1 deletion object_tracker.py
@@ -38,7 +38,38 @@
flags.DEFINE_boolean('info', False, 'show detailed info of tracked objects')
flags.DEFINE_boolean('count', False, 'count objects being tracked on screen')

def get_speed(dist, frame1, frame2, fps):
    # frames elapsed between crossing the first and the second line
    t = frame2 - frame1
    # assuming dist is in metres, (dist * fps) / t gives metres per second
    speed = (dist * fps) / t
    speed = speed * 60 * 60  # metres per hour
    speed = speed / 1000     # kilometres per hour
    return speed

def init_array(n, k):
    # build a list of n elements, all initialised to k
    arr = []
    j = 0
    while (j < n):
        arr.append(k)
        j = j + 1
    return arr

def print_speeds(max_v, max_tid, dist, enter_frame, exit_frame, mf, mfps):
    # print the speed of every track that crossed both lines (mf[j] == 2)
    j = 0
    while (j < max_v and j <= max_tid):
        if (mf[j] == 2):
            speed = get_speed(dist, enter_frame[j], exit_frame[j], mfps)
            print("speed of vehicle id : " + str(j) + " is " + str(speed) + "\n")
        j = j + 1


def main(_argv):
    print("halo :D test 2\n")

    # distance and pixel line declarations
    dist = 20
    line1 = 433
    line2 = 604

    # Definition of the parameters
    max_cosine_distance = 0.4
    nn_budget = None
@@ -91,6 +122,22 @@ def main(_argv):
    out = cv2.VideoWriter(FLAGS.output, codec, fps, (width, height))

    frame_num = 0


    # my_vars

    width = int(vid.get(cv2.CAP_PROP_FRAME_WIDTH))
    height = int(vid.get(cv2.CAP_PROP_FRAME_HEIGHT))
    mfps = int(vid.get(cv2.CAP_PROP_FPS))
    max_v = 100000                       # number of track ids we keep state for
    enter_frame = init_array(max_v, -1)  # frame at which each track crossed line1
    exit_frame = init_array(max_v, -1)   # frame at which each track crossed line2
    mf = init_array(max_v, 0)            # per-track state: 0 = before line1, 1 = past line1, 2 = past line2
    prev = init_array(max_v, 15000)      # previous bottom-y of each track's bounding box
    max_tid = -1                         # highest track id seen so far

    print("resolution is " + str(height) + " x " + str(width) + "\n")
    print("fps is " + str(mfps) + "\n")
    # while video is running
    while True:
        return_value, frame = vid.read()
@@ -160,7 +207,7 @@ def main(_argv):
        allowed_classes = list(class_names.values())

        # custom allowed classes (uncomment line below to customize tracker for only people)
        #allowed_classes = ['person']
        allowed_classes = ['car','truck','bus','motorbike']

        # loop through objects and use class index to get class name, allow only classes in allowed_classes list
        names = []
@@ -213,11 +260,28 @@ def main(_argv):
            cv2.rectangle(frame, (int(bbox[0]), int(bbox[1])), (int(bbox[2]), int(bbox[3])), color, 2)
            cv2.rectangle(frame, (int(bbox[0]), int(bbox[1]-30)), (int(bbox[0])+(len(class_name)+len(str(track.track_id)))*17, int(bbox[1])), color, -1)
            cv2.putText(frame, class_name + "-" + str(track.track_id),(int(bbox[0]), int(bbox[1]-10)),0, 0.75, (255,255,255),2)
            # draw the two reference lines
            cv2.line(frame, (1, line1), (width-1, line1), (0, 0, 255), 2)
            # cv2.putText(frame, ('LINE 1'), (172, 198), cv2.FONT_HERSHEY_SIMPLEX, 0.5, color, 1, cv2.LINE_AA)
            cv2.line(frame, (1, line2), (width-1, line2), (255, 0, 0), 2)
            # cv2.putText(frame, ('LINE 2'), (8, 268), cv2.FONT_HERSHEY_SIMPLEX, 0.5, color, 1, cv2.LINE_AA)

            # if enable info flag then print details about each track
            if FLAGS.info:
                print("Tracker ID: {}, Class: {}, BBox Coords (xmin, ymin, xmax, ymax): {}".format(str(track.track_id), class_name, (int(bbox[0]), int(bbox[1]), int(bbox[2]), int(bbox[3]))))
            tid = track.track_id
            ymin = bbox[3]  # bottom edge of the bounding box
            if (tid < max_v):
                # record the frame at which the box bottom first passes each line
                if (mf[tid] == 0 and ymin >= line1 and prev[tid] <= line1):
                    mf[tid] = 1
                    enter_frame[tid] = frame_num
                if (mf[tid] == 1 and ymin >= line2 and prev[tid] <= line2):
                    mf[tid] = 2
                    exit_frame[tid] = frame_num
                max_tid = max(max_tid, tid)
                prev[tid] = ymin


        # calculate frames per second of running detections
        fps = 1.0 / (time.time() - start_time)
        print("FPS: %.2f" % fps)
@@ -231,6 +295,8 @@
        if FLAGS.output:
            out.write(result)
        if cv2.waitKey(1) & 0xFF == ord('q'): break

    # report the estimated speed of every vehicle that crossed both lines
    print_speeds(max_v, max_tid, dist, enter_frame, exit_frame, mf, mfps)
    cv2.destroyAllWindows()

if __name__ == '__main__':
47 changes: 47 additions & 0 deletions object_tracker_additions.py
@@ -0,0 +1,47 @@
def get_speed(dist, frame1, frame2, fps):
    t = frame2 - frame1
    speed = (dist * fps) / t
    return speed

def init_array(n, k):
    arr = []
    j = 0
    while (j < n):
        arr.append(k)
        j = j + 1
    return arr



mfps = int(vid.get(cv2.CAP_PROP_FPS))
max_v=100000
enter_frame=init_array(max_v,-1)
exit_frame=init_array(max_v,-1)
mf=init_array(max_v,0)
max_tid=-1
dist=0.5
line1=500
line2=1000

#use prev
#################################


tid = track.track_id
ymin = bbox[1]
if (tid < max_v):
    if (mf[tid] == 0 and ymin >= line1):
        mf[tid] = 1
        enter_frame[tid] = frame_num
    if (mf[tid] == 1 and ymin >= line2):
        mf[tid] = 2
        exit_frame[tid] = frame_num
    max_tid = max(max_tid, tid)

###################################


j = 0
while (j < max_v and j <= max_tid):
    if (mf[j] == 2):
        speed = get_speed(dist, enter_frame[j], exit_frame[j], mfps)
    j = j + 1
2 changes: 1 addition & 1 deletion tools/generate_detections.py
@@ -60,7 +60,7 @@ def extract_image_patch(image, bbox, patch_shape):

    # convert to top left, bottom right
    bbox[2:] += bbox[:2]
    bbox = bbox.astype(np.int)
    bbox = bbox.astype(int)

    # clip at image boundaries
    bbox[:2] = np.maximum(0, bbox[:2])
Expand Down