[InferenceSlicer] - allow batch size inference #781

Open · inakierregueab opened this issue Jan 25, 2024 · 21 comments
Labels: enhancement (New feature or request), Q2.2024 (Tasks planned for execution in Q2 2024)

@inakierregueab commented Jan 25, 2024

Description

Currently, sv.InferenceSlicer processes each slice in a separate callback call, which prevents inference with a batch size larger than 1. We can change this by:

  • Batching Slices: Instead of submitting an individual task for each slice, group slices into batches. batch_size can be a new parameter of the InferenceSlicer class.
  • Modifying the Callback: Ensure the callback function can handle a batch of slices instead of a single slice, changing its signature from callback: Callable[[np.ndarray], Detections] to callback: Callable[[List[np.ndarray]], List[Detections]].
  • Collecting and Merging Results: After processing, collect and merge the results from all batches appropriately (a rough sketch follows this list).
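A rough sketch of how these pieces could fit together. This is illustrative rather than the actual supervision implementation; run_slicer_batched and its batch_size argument are hypothetical names, and the coordinate-offset step is only noted in a comment:

```python
from typing import Callable, List

import numpy as np
import supervision as sv


def run_slicer_batched(
    slices: List[np.ndarray],
    callback: Callable[[List[np.ndarray]], List[sv.Detections]],
    batch_size: int = 4,
) -> sv.Detections:
    """Hypothetical helper: run a batched callback over pre-cropped slices
    and merge the per-slice detections into a single Detections object."""
    results: List[sv.Detections] = []
    for start in range(0, len(slices), batch_size):
        batch = slices[start:start + batch_size]
        # One callback call now handles a whole batch of slices.
        results.extend(callback(batch))
    # NOTE: the real slicer must also shift each slice's detections back into
    # full-image coordinates before merging.
    return sv.Detections.merge(results)
```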

Additional

  • Note: Please share a Google Colab with minimal code to test the new feature. We know it's additional work, but it will definitely speed up the review process. Each change must be tested by the reviewer. Setting up a local environment to do this is time-consuming. Please ensure that Google Colab can be accessed without any issues (make it public). Thank you! 🙏🏻
inakierregueab added the enhancement (New feature or request) label on Jan 25, 2024
@SkalskiP (Collaborator)

Hi, @inakierregueab 👋🏻 That is something we were considering but didn't implement due to time constraints. Let me add some details to this issue. Maybe someone will pick it up.

SkalskiP changed the title from "Batch size inference slicer" to "[InferenceSlicer] - allow batch size inference" on Jan 25, 2024
SkalskiP added the Q1.2024 (Tasks planned for execution in Q1 2024) label on Jan 25, 2024
@Bhavay-2001 (Contributor)

Hi @SkalskiP, can I work on this issue if it is beginner-friendly?
Thanks

@SkalskiP (Collaborator)

Hi, @Bhavay-2001 👋🏻 Do you already have experience with running model inference at different batch sizes?

@Bhavay-2001 (Contributor)

Hi @SkalskiP, yes I think I can manage that. Can you please let me know how to proceed with this? Thanks

@SkalskiP (Collaborator)

Great! Do you have any specific questions?

@Bhavay-2001 (Contributor)

Hi @SkalskiP, how should I add the batch_size feature to the InferenceSlicer class, and how can I test it in Google Colab? Any starting point that helps me get on track would be appreciated.

@SkalskiP (Collaborator)

I outlined the key steps needed to add batch_size support in the task description. I think you should just try to implement it, get a first working version, and submit a PR so we can review it.

@Bhavay-2001 (Contributor)

Hi @SkalskiP, can you please point me to a code sample that has already been implemented and provides the batch_size functionality?

@SkalskiP (Collaborator)

@Bhavay-2001, I'm afraid we do not have a code sample. Implementing batch inference is exactly what this task is meant to deliver. :/

@Bhavay-2001 (Contributor)

@SkalskiP, what I am thinking of doing is implementing a for loop over the batch of images: each image is passed to the model, its detections are collected, and at the end the detections for the whole batch are returned, roughly like the sketch below.
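A sketch of that idea, assuming an Ultralytics model purely for illustration (the weights file is a placeholder). Note that this still runs the model once per image rather than doing a true batched forward pass:

```python
from typing import List

import numpy as np
import supervision as sv
from ultralytics import YOLO

# Placeholder model; any single-image detector would work the same way.
model = YOLO("yolov8s.pt")


def batch_callback(image_batch: List[np.ndarray]) -> List[sv.Detections]:
    """Loop over the batch, run the model on each image, collect detections."""
    detections_batch: List[sv.Detections] = []
    for image in image_batch:
        result = model(image)[0]  # one inference call per image
        detections_batch.append(sv.Detections.from_ultralytics(result))
    return detections_batch
```

For a real speedup, the whole list would have to go to the model in a single call (Ultralytics, for instance, accepts a list of images), which is what the batched slicer is meant to enable.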

@Bhavay-2001 (Contributor)

Hi @SkalskiP, can you please review this PR?

@Bhavay-2001 (Contributor)

Hi @SkalskiP, can you please review and let me know? Thanks

SkalskiP added the Q2.2024 (Tasks planned for execution in Q2 2024) label and removed the Q1.2024 (Tasks planned for execution in Q1 2024) label on Apr 8, 2024
@LinasKo (Collaborator) commented Apr 10, 2024

SkalskiP and I had a conversation about this; I'll take over for now.

@LinasKo (Collaborator) commented Apr 10, 2024

Intermediate results:

  1. I've confirmed that threads help, especially when the model is run on the CPU. I see a 5-10x performance improvement.
  2. I've implemented the batched inference slicer, allowing users to pass in both single images and lists of images.
  3. The threading implementation is kept; the docs point users to either batch=N; threads=1 or batch=1; threads=N, depending on GPU/CPU needs (see the configuration sketch at the end of this comment).

Testing more broadly, however, gives mixed results.

  1. On my machine, batching provides a speed boost for ultralytics but does nothing for transformers (GPU) or inference (CPU, I believe).
  2. Using threads=8 slows down the ultralytics batch=1 case compared to threads=1, but only on my machine; in Colab it's faster.

Still checking transformers: there's an obvious speedup on the GPU, but I ran out of memory when trying batching.

Colab coming soon.
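For reference, a configuration sketch of the two setups mentioned above. thread_workers is the existing InferenceSlicer parameter; batch_size follows the naming used in this thread and may differ from the parameter name that finally lands in the PR:

```python
import cv2
import numpy as np
import supervision as sv
from ultralytics import YOLO

model = YOLO("yolov8s.pt")
image = cv2.imread("image.jpg")


def single_callback(image_slice: np.ndarray) -> sv.Detections:
    return sv.Detections.from_ultralytics(model(image_slice)[0])


def batch_callback(slices: list) -> list:
    # Ultralytics accepts a list of images, so the whole batch goes in one call.
    return [sv.Detections.from_ultralytics(r) for r in model(slices)]


# NOTE: `batch_size` is the name used in this discussion; check the PR for the
# final API.
# GPU-leaning setup: batch the slices, keep a single worker thread.
gpu_slicer = sv.InferenceSlicer(callback=batch_callback, batch_size=8, thread_workers=1)

# CPU-leaning setup: no batching, several worker threads.
cpu_slicer = sv.InferenceSlicer(callback=single_callback, batch_size=1, thread_workers=8)

detections = gpu_slicer(image)
```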

@LinasKo (Collaborator) commented Apr 10, 2024

https://colab.research.google.com/drive/1j85QErM74VCSLADoGliM296q4GFUdnGM?usp=sharing

As you can see, in these tests it only helped the Ultralytics case.

Known shortcomings:

  • The Inference model (test 1) is fit for vehicle detection but is tested on an image with people.
  • No output image to visually check how well it performed.
  • No tests for the auto-batch case (when max_batch_size=-1).
  • Missing examples in the docstring: normal vs. batch callback.
  • No improvements to NMS efficiency.

@LinasKo (Collaborator) commented Apr 10, 2024

PR: #1108

@LinasKo (Collaborator) commented Apr 11, 2024

@SkalskiP,
Ready for review, details in #1108.

@Geethen commented Jun 20, 2024

This batched inference slicer does not write the detections to file. Also, a drawback of the original InferenceSlicer is that it assumes the entire image can be read into memory, which may not be the case when dealing with large satellite images. A solution to this is windowed reading and writing; the rasterio package offers windowed reads and writes.
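A minimal sketch of windowed reading with rasterio, for anyone exploring this direction (the path and tile size are placeholders):

```python
import rasterio
from rasterio.windows import Window

# Process a large raster tile by tile instead of loading it all into memory.
with rasterio.open("large_satellite_image.tif") as src:
    tile_size = 1024
    for row_off in range(0, src.height, tile_size):
        for col_off in range(0, src.width, tile_size):
            window = Window(
                col_off=col_off,
                row_off=row_off,
                width=min(tile_size, src.width - col_off),
                height=min(tile_size, src.height - row_off),
            )
            tile = src.read(window=window)  # array of shape (bands, height, width)
            # ...run detection on `tile` here, then write results back with a
            # matching windowed write if a raster output is needed.
```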

@LinasKo (Collaborator) commented Jun 20, 2024

Hi @Geethen,

You brought up very good points. Indeed, the slicer is better suited to detecting small objects than to handling very large images; when dealing with the latter, it will hog all available memory. I'll bring this up in our internal discussions.

As for saving the results, that's decoupled. Check out Sinks.
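For example, a minimal sketch with sv.CSVSink (the file name and custom_data are illustrative; sv.JSONSink works the same way):

```python
import supervision as sv

# Illustrative only: `detections` would normally come from the slicer call.
detections = sv.Detections.empty()

with sv.CSVSink("detections.csv") as sink:
    sink.append(detections, custom_data={"image_name": "image.jpg"})
```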

@Geethen commented Jun 20, 2024 via email

@linas commented Jun 20, 2024

Hi @Geethen, you've got the wrong Linas. I'm Linas Vepštas; you want @LinasKo.
