Add Count tools #56

Merged
merged 5 commits into from
Apr 22, 2024

Conversation

shankar-vision-eng
Collaborator

This PR adds two tools to vision-agent:

  • Zero-shot counting: counts the total number of instances of objects belonging to the same class.
  • Visual prompt counting: counts the total number of instances of objects belonging to the same class, given an exemplar bounding box in the form "xmin, ymin, xmax, ymax".
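
For illustration, the exemplar format described in the second bullet could be parsed along these lines. This is a minimal sketch, not the code in this PR: `parse_exemplar_bbox` is a hypothetical helper, and the ordering check (xmin < xmax, ymin < ymax) is an assumption about what a valid exemplar looks like.

```python
from typing import List


def parse_exemplar_bbox(prompt: str) -> List[float]:
    """Parse an 'xmin, ymin, xmax, ymax' string into four floats.

    Hypothetical helper for illustration; not part of vision-agent.
    """
    parts = [p.strip() for p in prompt.split(",")]
    if len(parts) != 4:
        raise ValueError(f"expected 4 comma-separated values, got {len(parts)}")
    # float() raises ValueError on non-numeric input, which we let propagate.
    xmin, ymin, xmax, ymax = (float(p) for p in parts)
    # Assumption: a valid exemplar has positive width and height.
    if xmin >= xmax or ymin >= ymax:
        raise ValueError("expected xmin < xmax and ymin < ymax")
    return [xmin, ymin, xmax, ymax]


print(parse_exemplar_bbox("10, 20, 110, 220"))  # [10.0, 20.0, 110.0, 220.0]
```

Validating up front means a malformed exemplar fails with a clear message before any model call is made.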

Member

@dillonalaird dillonalaird left a comment

looks good, few minor comments

Member

@dillonalaird dillonalaird left a comment

Forgot, make sure to add the new tools to the README tools section

…zed coordinates, refactoring code, adding llm generate counter tool
Member

@dillonalaird dillonalaird left a comment

Looks good! Left one optional comment. I think the tests are only failing because of linting, so this can merge once that's fixed.

A dictionary containing the key 'count' and the count as value. E.g. {'count': 12}
"""
image_size = get_image_size(image)
bbox = [float(x) for x in prompt.split(",")]
Member

This is minor, but I would just require the bounding boxes as input instead of a prompt so it doesn't have to worry about parsing.

Collaborator Author

Sounds good. I would make it generic so that it accepts either a bbox or a prompt. Will include it in the next PR.
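
One way the generic input discussed in this thread might look is sketched below. `to_bbox` and the `BBoxInput` alias are hypothetical names, not part of vision-agent; the sketch only assumes the "xmin, ymin, xmax, ymax" string format from the PR description.

```python
from typing import List, Sequence, Union

# Hypothetical alias: either a comma-separated string or four numbers.
BBoxInput = Union[str, Sequence[float]]


def to_bbox(exemplar: BBoxInput) -> List[float]:
    """Accept a prompt string or a sequence of numbers and return four floats."""
    if isinstance(exemplar, str):
        # String form, e.g. "0.1, 0.2, 0.5, 0.6"; float() tolerates whitespace.
        values = [float(x) for x in exemplar.split(",")]
    else:
        # Sequence form, e.g. [10, 20, 110, 220] or a tuple.
        values = [float(x) for x in exemplar]
    if len(values) != 4:
        raise ValueError(f"expected 4 values, got {len(values)}")
    return values


print(to_bbox("0.1, 0.2, 0.5, 0.6"))  # [0.1, 0.2, 0.5, 0.6]
print(to_bbox([10, 20, 110, 220]))    # [10.0, 20.0, 110.0, 220.0]
```

Callers that already hold a bounding box skip string formatting entirely, while existing prompt-based callers keep working.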

@shankar-vision-eng shankar-vision-eng merged commit 5f11aea into main Apr 22, 2024
7 checks passed
@shankar-vision-eng shankar-vision-eng deleted the add_counting_tool branch April 22, 2024 21:09
dillonalaird pushed a commit that referenced this pull request Apr 23, 2024
* Adding counting tools to vision agent

* fixed heatmap overlay and addressed PR comments

* adding the counting tool to take both absolute coordinate and normalized coordinates, refactoring code, adding llm generate counter tool

* fix linting
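
The third commit message above mentions accepting both absolute and normalized coordinates. A minimal sketch of one way to support both follows; it is not taken from the PR. `to_absolute` is a hypothetical helper, the (height, width) order of `image_size` is an assumption, and treating a box whose values all lie in [0, 1] as normalized is a heuristic chosen for illustration.

```python
from typing import List, Sequence, Tuple


def to_absolute(bbox: Sequence[float], image_size: Tuple[int, int]) -> List[float]:
    """Return [xmin, ymin, xmax, ymax] in pixels.

    Assumptions (not from the PR): image_size is (height, width), and a box
    whose four values all lie in [0, 1] is interpreted as normalized.
    """
    height, width = image_size
    xmin, ymin, xmax, ymax = bbox
    if all(0.0 <= v <= 1.0 for v in bbox):
        # Normalized input: scale x by width and y by height.
        return [xmin * width, ymin * height, xmax * width, ymax * height]
    # Otherwise treat the values as pixel coordinates already.
    return [float(v) for v in bbox]


print(to_absolute([0.25, 0.5, 0.75, 1.0], (200, 400)))  # [100.0, 100.0, 300.0, 200.0]
print(to_absolute([10, 20, 30, 40], (200, 400)))        # [10.0, 20.0, 30.0, 40.0]
```

The heuristic is ambiguous for tiny pixel boxes near the origin, so an explicit flag would be the safer design if both forms must be supported.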
dillonalaird added a commit that referenced this pull request Apr 24, 2024
* added custom tools

* updated readme

* register tool returns tool

* Add a new tool: determine if a bbox is contained within another bbox (#59)

* Add a new bounding box contains tool

* Fix format

* [skip ci] chore(release): vision-agent 0.1.5

* Add Count tools (#56)

* Adding counting tools to vision agent

* fixed heatmap overlay and addressed PR comments

* adding the counting tool to take both absolute coordinate and normalized coordinates, refactoring code, adding llm generate counter tool

* fix linting

* Remove torch and cuda dependencies (#60)

Resolve merge conflicts

* [skip ci] chore(release): vision-agent 0.2.1

* make it easier to use custom tools

* ran isort

* fix linting error

* added OCR

* added example template matching use case

* formatting and typing fix

* round scores

* fix readme typo

---------

Co-authored-by: Asia <[email protected]>
Co-authored-by: GitHub Actions Bot <[email protected]>
Co-authored-by: Shankar <[email protected]>
2 participants