Commit
* Add DINOv as a new tool
* Fix lint errors
* Update docs
* Fix param name mismatch (#45)
  Co-authored-by: Yazhou Cao <[email protected]>
* Grammar/Spelling fixes (#46)
* Update prompts.py
* Update vision_agent_prompts.py
* Update reflexion_prompts.py
* Update vision_agent_prompts.py
* Update easytool_prompts.py
* Update prompts.py
* Update vision_agent_prompts.py
* Switch to the tools endpoint (#40)
* get endpoint ready for demo fixed tools.json Update vision_agent/tools/tools.py Bug fixes
* Fix linter errors
* Fix a bug in result parsing
* Include scores in the G-SAM model response
* Removed tools.json, need to find better format
* Fixing the endpoint for CLIP and adding thresholds for grounding tools
* fix mypy errors
* fixed example notebook
---------
Co-authored-by: Yazhou Cao <[email protected]>
Co-authored-by: shankar_ws3 <[email protected]>
* Support streaming chat logs of an agent (#47) Add a callback for reporting the chat progress of an agent
* Empty-Commit
* Empty-Commit: attempt to fix release
* [skip ci] chore(release): vision-agent 0.0.49
* Fix typo (#48)
  Co-authored-by: Yazhou Cao <[email protected]>
* [skip ci] chore(release): vision-agent 0.0.50
* Fix a typo in log (#49) Fix another typo
  Co-authored-by: Yazhou Cao <[email protected]>
* [skip ci] chore(release): vision-agent 0.0.51
* Fix Baby Cam Use Case (#51)
* fix visualization error
* added font and score to viz
* changed to smaller font file
* Support streaming chat logs of an agent (#47) Add a callback for reporting the chat progress of an agent
* fix visualize score issue
* updated descriptions, fixed counter bug
* added visualize_output
* make feedback more concrete
* made naming more consistent
* replaced individual calc ops with calculator tool
* fix random colors
* fix prompts for tools
* update reflection prompt
* update readme
* formatting fix
* fixed mypy errors
* fix merge issue
---------
Co-authored-by: Asia <[email protected]>
* [skip ci] chore(release): vision-agent 0.0.52
* Add image caption tool (#52) added image caption tool
* [skip ci] chore(release): vision-agent 0.0.53
* refactor: switch model endpoints (#54)
* Switch the host of model endpoint to api.dev.landing.ai
* DRY/Abstract out the inference code in tools
* Introduce LandingaiAPIKey and support loading from .env file
* Add integration tests for four model tools
* Minor tweaks/fixes
* Remove dead code
* Bump the minor version to 0.1.0
* [skip ci] chore(release): vision-agent 0.1.1
* Pool Demo (#53)
* visualized output/reflection to handle extract_frames_
* remove ipdb
* added json mode for lmm, upgraded gpt-4-turbo
* updated reflection prompt
* refactor to make function simpler
* updated reflection prompt, add tool usage doc
* fixed format issue
* fixed type issue
* fixed test case
* [skip ci] chore(release): vision-agent 0.1.2
* feat: allow disable motion detection in frame extraction function (#55)
* Tweak frame extraction function
* remove default motion detection, extract at 0.5 fps
* lmm now take multiple images
* removed counter
* tweaked prompt
* updated vision agent to reflect on multiple images
* fix test case
* added box distance
* adjusted prompts
---------
Co-authored-by: Yazhou Cao <[email protected]>
Co-authored-by: Dillon Laird <[email protected]>
* [skip ci] chore(release): vision-agent 0.1.3
* doc changes
* fixed merge issues
* fix color issue
* add dinov with updated endpoint
* formatting fix
* added reference mask support
* fix linting
---------
Co-authored-by: Yazhou Cao <[email protected]>
Co-authored-by: Cameron Maloney <[email protected]>
Co-authored-by: shankar_ws3 <[email protected]>
Co-authored-by: Dillon Laird <[email protected]>
Co-authored-by: Shankar <[email protected]>