Adding OWLViT/OWLV2 as options for the visual grounding part #55

Open
skulshreshtha opened this issue Apr 5, 2024 · 3 comments
Labels
enhancement New feature or request

Comments

@skulshreshtha

🚀 Feature

Currently, the project uses GroundingDINO as the visual grounding model, which is the best-performing model on some of the current benchmarks for zero-shot object detection.
We could give users the flexibility to choose between different visual grounding models, such as OWLViT, OWLV2, or OFA.

Motivation & Examples

Tell us why the feature is useful.
Since this project is about text-guided segmentation, letting users choose which model handles the visual grounding step of the pipeline seems like a natural addition.

Describe what the feature would look like, if it is implemented.
Best demonstrated using code examples in addition to words.

from PIL import Image
from lang_sam import LangSAM

# Initialize and, if desired, select the visual grounding model.
# The default would be 'groundingdino'; other options are 'ofa', 'owlvit', and 'owlv2'.
model = LangSAM(model='groundingdino')
image_pil = Image.open("./assets/car.jpeg").convert("RGB")
text_prompt = "wheel"
masks, boxes, phrases, logits = model.predict(image_pil, text_prompt)

Note

We only consider adding new features if they are relevant to this library.
Consider if this new feature deserves to be here or should be a new library.

@skulshreshtha skulshreshtha added the enhancement New feature or request label Apr 5, 2024
@luca-medeiros
Owner

@skulshreshtha Interesting!
Do you want to try an implementation for it?

@skulshreshtha
Author

@luca-medeiros Yes, sure. If you think this makes sense, I can try and raise a PR for this.

@ogencoglu

+1 for this

3 participants