Please be aware that using the API in this project requires you to have API credits (minimum of five US dollars). This is different from the OpenAI subscription used in this chatbot. If you don't have credits, further information can be found [here](https://github.com/landing-ai/vision-agent?tab=readme-ov-file#how-to-get-started-with-openai-api-credits).
### Vision Agent
There are two agents that you can use. VisionAgent is a conversational agent that has
access to tools that allow it to write and navigate Python code. It can converse with
the user in natural language. VisionAgentCoder is an agent that can write code for
vision tasks, such as counting people in an image. However, it cannot converse and can
only respond with code. VisionAgent can call VisionAgentCoder to write vision code.
#### Basic Usage
```python
>>> from vision_agent.agent import VisionAgent
>>> agent = VisionAgent()
>>> # A plain string starts a new conversation; the full message list is returned.
>>> resp = agent("Hello")
>>> print(resp)
[{"role": "user", "content": "Hello"}, {"role": "assistant", "content": "{'thoughts': 'The user has greeted me. I will respond with a greeting and ask how I can assist them.', 'response': 'Hello! How can I assist you today?', 'let_user_respond': True}"}]
>>> # Append the next user turn, attaching an image through the "media" key.
>>> resp.append({"role": "user", "content": "Can you count the number of people in this image?", "media": ["people.jpg"]})
>>> # Pass the whole conversation back to the agent to continue it.
>>> resp = agent(resp)
```
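
In the output above, the assistant's `content` is a string holding a Python-style dictionary rather than a nested object. The sketch below shows one way to read that reply and to build the next user turn without calling the API. It assumes the `content` string always has the form printed above, which may differ between versions; `parse_assistant_reply` and `add_user_turn` are hypothetical helper names, not part of `vision_agent`.

```python
import ast
from typing import Any, Dict, List, Optional

Message = Dict[str, Any]


def parse_assistant_reply(conversation: List[Message]) -> Dict[str, Any]:
    """Return the most recent assistant message's content as a dict.

    Assumption: the content is a Python-literal dict string, as in the
    example output above. ast.literal_eval only evaluates literals, so it
    is safe to use on untrusted text.
    """
    for message in reversed(conversation):
        if message["role"] == "assistant":
            return ast.literal_eval(message["content"])
    raise ValueError("no assistant message in conversation")


def add_user_turn(
    conversation: List[Message], text: str, media: Optional[List[str]] = None
) -> List[Message]:
    """Return a new conversation with one user turn appended.

    The "media" key is only set when media paths are given, mirroring the
    message shape used in the example above.
    """
    turn: Message = {"role": "user", "content": text}
    if media:
        turn["media"] = list(media)
    return conversation + [turn]


# Conversation copied from the example output above.
conversation = [
    {"role": "user", "content": "Hello"},
    {
        "role": "assistant",
        "content": "{'thoughts': 'The user has greeted me. I will respond with a greeting and ask how I can assist them.', 'response': 'Hello! How can I assist you today?', 'let_user_respond': True}",
    },
]

reply = parse_assistant_reply(conversation)
print(reply["response"])  # prints: Hello! How can I assist you today?

follow_up = add_user_turn(
    conversation,
    "Can you count the number of people in this image?",
    media=["people.jpg"],
)
```

The resulting `follow_up` list has the same shape as `resp` after the `append` call above, so it could be passed to `agent(...)` in the same way.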
### Vision Agent Coder
#### Basic Usage
You can interact with the agent as you would with any LLM or LMM model: