🚀 Feature

Currently, the project uses GroundingDINO as the visual grounding model, which is the best-performing model on several benchmark datasets. We could give users the flexibility to choose between different visual grounding models, such as OFA, OWL-ViT, and OWLv2.
Motivation & Examples

Tell us why the feature is useful.
Since this project is about text-guided segmentation, adding the ability to choose the technique used in the visual grounding pipeline seems like a natural addition.
Describe what the feature would look like if it were implemented. This is best demonstrated using code examples in addition to words.
```python
from PIL import Image
from lang_sam import LangSAM

# Initialize and select the visual grounding model if desired.
# The default would be 'groundingdino'; other options would be
# 'ofa', 'owlvit', and 'owlv2'.
model = LangSAM(model='groundingdino')

image_pil = Image.open("./assets/car.jpeg").convert("RGB")
text_prompt = "wheel"
masks, boxes, phrases, logits = model.predict(image_pil, text_prompt)
```
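Internally, the selection could be handled by a small registry that maps user-facing model names to backend constructors. This is only a minimal sketch of one possible design; the class and function names below are hypothetical and not part of the lang_sam API.

```python
# Hypothetical dispatch sketch: map grounding-model names to backend
# classes so LangSAM can pick one at construction time. All names here
# are illustrative placeholders, not the library's real implementation.

class GroundingDINOBackend:
    """Placeholder standing in for a GroundingDINO wrapper."""
    name = "groundingdino"


class OWLViTBackend:
    """Placeholder standing in for an OWL-ViT wrapper."""
    name = "owlvit"


# Registry of supported grounding backends.
_BACKENDS = {
    "groundingdino": GroundingDINOBackend,
    "owlvit": OWLViTBackend,
}


def build_grounding_model(name: str = "groundingdino"):
    """Return an instance of the requested grounding backend,
    raising a helpful error for unknown names."""
    try:
        return _BACKENDS[name]()
    except KeyError:
        raise ValueError(
            f"Unknown grounding model {name!r}; "
            f"choose from {sorted(_BACKENDS)}"
        )
```

With this shape, adding a new backend is just another registry entry, and the `LangSAM` constructor would only need to call `build_grounding_model(model)` instead of hard-coding GroundingDINO.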
Note
We only consider adding new features if they are relevant to this library.
Consider if this new feature deserves to be here or should be a new library.