
Video and detection does not match. #4

Open
WilliamJuel opened this issue Oct 3, 2019 · 19 comments

@WilliamJuel

It seems that the video linked in the README.md (https://drive.google.com/open?id=1h2Wnb98tDVB6JlCDNQXCeZpG20x6AiZ2) does not match the detections in Nanonets_object_tracking/det/.

Each of the det_*.txt files covers 1955 frames, while the video consists of 2110 frames.
The mismatch is also confirmed visually: the bounding boxes (detections) do not line up with where the cars actually are, whether I use the provided model640.pt or a self-trained feature extractor on the given data. The program also crashes when it tries to process frame 1956 (for good reason).

Is there a new video, or what is going on here?
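A quick way to confirm the mismatch yourself (a minimal sketch; it assumes OpenCV is installed, that the video file is vdo.avi, and that the first comma-separated column of the det file is the frame number):

```python
import cv2

# Count frames in the video (assumes OpenCV can decode the codec).
cap = cv2.VideoCapture("vdo.avi")
n_video_frames = int(cap.get(cv2.CAP_PROP_FRAME_COUNT))
cap.release()

# Find the highest frame index referenced by one of the det_*.txt files
# (det_ssd512.txt here is just a stand-in for whichever file you use).
with open("det/det_ssd512.txt") as f:
    n_det_frames = max(int(line.split(",")[0]) for line in f if line.strip())

print(f"video frames: {n_video_frames}, detection frames: {n_det_frames}")
```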

@johnnylord

Same issue XD

@mswiniars

Same problem, is this model working?

@yuntai

yuntai commented Nov 2, 2019

I ran a detector on vdo.avi and dumped out a detection result that matches the video clip.

https://gist.github.com/yuntai/d0eb58b0eab620db65ac51e326be4c77

using detectron2 (COCO trained faster_rcnn_X_101_32x8d_FPN_3x) from
https://github.com/facebookresearch/detectron2/blob/master/MODEL_ZOO.md
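For reference, setting up that exact model follows detectron2's standard model-zoo pattern; this is a sketch of that pattern, not necessarily yuntai's exact script, and the score threshold is my own assumption:

```python
from detectron2 import model_zoo
from detectron2.config import get_cfg
from detectron2.engine import DefaultPredictor

# Standard model-zoo setup for faster_rcnn_X_101_32x8d_FPN_3x.
cfg = get_cfg()
cfg.merge_from_file(model_zoo.get_config_file(
    "COCO-Detection/faster_rcnn_X_101_32x8d_FPN_3x.yaml"))
cfg.MODEL.WEIGHTS = model_zoo.get_checkpoint_url(
    "COCO-Detection/faster_rcnn_X_101_32x8d_FPN_3x.yaml")
cfg.MODEL.ROI_HEADS.SCORE_THRESH_TEST = 0.5  # assumed threshold
predictor = DefaultPredictor(cfg)  # call as predictor(bgr_image)
```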

@Ujang24

Ujang24 commented Dec 2, 2019

> I ran a detector on vdo.avi and dumped out a detection result that matches the video clip.
> https://gist.github.com/yuntai/d0eb58b0eab620db65ac51e326be4c77
> using detectron2 (COCO trained faster_rcnn_X_101_32x8d_FPN_3x) from
> https://github.com/facebookresearch/detectron2/blob/master/MODEL_ZOO.md

Dear @yuntai, thank you for sharing. It is now working.
By the way, how did you get the detections? Can you be more specific?

I saw that your detection result starts with lines like this one:
1,-1,126.682,445.587,489.079,205.913,0.996153,-1,-1,-1

Could you explain what all these numbers mean?

Many thanks

@aniketvartak

aniketvartak commented Dec 3, 2019

Thanks @yuntai, that txt file works, and none of the others provided in the repo do.

@anzy0621

@yuntai Hey, can you tell me how you generated the detections text file with detectron2? I know we can generate output videos or images with detectron2, but I'm not sure it can generate a detections text file. Any help would be appreciated, thank you!

@MinaAbdElMassih

MinaAbdElMassih commented Feb 17, 2020

1,-1,126.682,445.587,489.079,205.913,0.996153,-1,-1,-1

- I think the first number is the frame number; when it is repeated, it means several instances were detected in that frame.
- The -1s are just placeholders, since the tracker code only selects columns [2:6] of each line. You can either keep them as they are or modify the code to select [1:5] and drop the remaining -1s.
- The next 4 numbers are x1, y1, w, h of the detected object's bounding box, so you can get the width and height as x2-x1 and y2-y1. You can take the corner coordinates from, say, outputs["instances"].pred_boxes, which gives you the tensor, and read the values with outputs["instances"].pred_boxes[i].tensor[0, 0].data.cpu().numpy() (tensor[0, 0] for x1). You can find more about the data types in the detectron2 documentation: https://detectron2.readthedocs.io/tutorials/models.html#model-input-format
- The last number (0.996153) is, I think, the confidence score.

You can basically write the numbers in that format to a text file, give the detections and the input video to the deepsort tracker, and it should work fine. :)
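A minimal sketch of that last step (assuming outputs is the dict returned by a detectron2 DefaultPredictor for a single frame, and frame_no is a counter you maintain yourself; neither name comes from the repo):

```python
# Append one frame's detections in the det_*.txt format described above.
instances = outputs["instances"].to("cpu")
boxes = instances.pred_boxes.tensor.numpy()  # (N, 4) as x1, y1, x2, y2
scores = instances.scores.numpy()

with open("det.txt", "a") as f:
    for (x1, y1, x2, y2), score in zip(boxes, scores):
        # frame, -1, x, y, w, h, confidence, -1, -1, -1
        f.write(f"{frame_no},-1,{x1},{y1},{x2 - x1},{y2 - y1},{score},-1,-1,-1\n")
```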

@anzy0621

anzy0621 commented Feb 22, 2020

> 1,-1,126.682,445.587,489.079,205.913,0.996153,-1,-1,-1
> - I think the first number is the frame number; when it is repeated, it means several instances were detected in that frame.
> - The second number, I think, is a non-existent class (let's say human); I also think every -1 represents a non-existent class.
> - The next 4 numbers are x1, y1, w, h of the detected object's bounding box, so you can get the width and height as x2-x1 and y2-y1, taking the corner coordinates from, say, outputs["instances"].pred_boxes and reading the values with outputs["instances"].pred_boxes[i].tensor[0, 0].data.cpu().numpy() (tensor[0, 0] for x1). More about the data types is in the detectron2 documentation: https://detectron2.readthedocs.io/tutorials/models.html#model-input-format
> - The last number (0.996153), I think, is the confidence that it is the said class (let's say car).
> - The rest of the -1s represent classes not present in the frame.
> You can basically write the numbers in that format to a text file, give the detections and the input video to the deepsort tracker, and it should work fine. :)

Hey @MinaAbdElMassih, thank you for the input! I appreciate it. I'm currently trying to output the text file. Have you tried doing this before?

@MinaAbdElMassih

@anzy0621 I hadn't done this before either, but I managed to modify the detector code in the detectron2 API to write the detection info in that format to a .txt file, and it worked. It's quite simple once you manage to get the needed values. :)

@AntonioMarsella

If I have only two classes to detect, what do the columns become?

@utkutpcgl

utkutpcgl commented Apr 10, 2020

None of the models provided in the repository works for me either.
@yuntai's file worked for me, thanks. By the way, good explanation, @MinaAbdElMassih.

@MohamedMostafaSoliman

> 1,-1,126.682,445.587,489.079,205.913,0.996153,-1,-1,-1
> - I think the first number is the frame number; when it is repeated, it means several instances were detected in that frame. [...]
> You can basically write the numbers in that format to a text file, give the detections and the input video to the deepsort tracker, and it should work fine. :)

Thanks mina🤗

@MinaAbdElMassih

MinaAbdElMassih commented Apr 14, 2020

@AntonioMarsella
I modified my answer above; I think it now better answers your question, since the -1s don't represent classes.

@pooya-mohammadi

> I ran a detector on vdo.avi and dumped out a detection result that matches the video clip.
>
> https://gist.github.com/yuntai/d0eb58b0eab620db65ac51e326be4c77
>
> using detectron2 (COCO trained faster_rcnn_X_101_32x8d_FPN_3x) from
> https://github.com/facebookresearch/detectron2/blob/master/MODEL_ZOO.md

The video has been removed from Google Drive. Can you share yours, please?

@yuntai

yuntai commented Jul 16, 2021

> > I ran a detector on vdo.avi and dumped out a detection result that matches the video clip.
> > https://gist.github.com/yuntai/d0eb58b0eab620db65ac51e326be4c77
> > using detectron2 (COCO trained faster_rcnn_X_101_32x8d_FPN_3x) from
> > https://github.com/facebookresearch/detectron2/blob/master/MODEL_ZOO.md
>
> The video has been removed from Google Drive. Can you share yours, please?

`find . ...` on my HD got me this one. Can you check whether this is the correct one?
https://drive.google.com/file/d/1PTBXBfCKuSCNk6wUGZ7pAQyj4rcKRkql/view?usp=sharing

@yuntai

yuntai commented Jul 16, 2021

> I ran a detector on vdo.avi and dumped out a detection result that matches the video clip.
>
> https://gist.github.com/yuntai/d0eb58b0eab620db65ac51e326be4c77
>
> using detectron2 (COCO trained faster_rcnn_X_101_32x8d_FPN_3x) from
> https://github.com/facebookresearch/detectron2/blob/master/MODEL_ZOO.md

I only just found out this thread had continued, but my local git repo is still there! The code goes in demo/predictor.py in the detectron2 repo:

```python
import numpy as np

outf = None  # detection file handle, opened lazily on first call

def process_detected_instance(predictions, frame_no):
    global outf
    boxes = predictions.pred_boxes.tensor.numpy()
    scores = predictions.scores.numpy()
    classes = predictions.pred_classes.numpy()
    # Keep only the relevant COCO classes:
    # 0=person, 1=bicycle, 2=car, 3=motorcycle, 5=bus, 7=truck
    mask = np.isin(classes, [0, 1, 2, 3, 5, 7])
    boxes = boxes[mask]
    scores = scores[mask]
    classes = classes[mask]
    if outf is None:
        outf = open('det.txt', 'w')

    for i in range(len(classes)):
        x1, y1, x2, y2 = list(boxes[i])
        # Convert corner coordinates to top-left x, y plus width/height.
        w = x2 - x1
        h = y2 - y1
        assert w > 0 and h > 0
        # frame, -1, x, y, w, h, confidence, -1, -1, -1
        print(','.join(
            list(map(str, [frame_no, -1])) +
            list(map(str, [x1, y1, w, h])) + [str(scores[i])] + ['-1'] * 3),
            file=outf, flush=True)
    print("frame_no({}) num({})".format(frame_no, len(classes)))
```

and I added `process_detected_instance(predictions, frame_no)` under the `elif "instances" in predictions:` branch further down in that file.
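If you'd rather not patch predictor.py at all, an equivalent standalone driver can feed frames to the same function (a sketch: it assumes a DefaultPredictor named predictor set up as in the model zoo, with process_detected_instance() from above in scope):

```python
import cv2

# Run the detector over vdo.avi frame by frame and dump detections,
# keeping frame numbering 1-based to match the det file format.
cap = cv2.VideoCapture("vdo.avi")
frame_no = 0
while True:
    ok, frame = cap.read()
    if not ok:
        break
    frame_no += 1
    instances = predictor(frame)["instances"].to("cpu")
    process_detected_instance(instances, frame_no)
cap.release()
```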

@dvrbanic

dvrbanic commented May 6, 2022

> > > I ran a detector on vdo.avi and dumped out a detection result that matches the video clip.
> > > https://gist.github.com/yuntai/d0eb58b0eab620db65ac51e326be4c77
> > > using detectron2 (COCO trained faster_rcnn_X_101_32x8d_FPN_3x) from
> > > https://github.com/facebookresearch/detectron2/blob/master/MODEL_ZOO.md
> >
> > The video has been removed from Google Drive. Can you share yours, please?
>
> `find . ...` on my HD got me this one. Can you check whether this is the correct one? https://drive.google.com/file/d/1PTBXBfCKuSCNk6wUGZ7pAQyj4rcKRkql/view?usp=sharing

Hi, could you please share your video again if you still have it? The link doesn't work anymore.

@yuntai

yuntai commented May 7, 2022

https://drive.google.com/file/d/1ADVZyR3BdWUm-saeM6GcFtbw6E2lUcKk/view?usp=sharing

@dvrbanic

dvrbanic commented May 8, 2022
