
Update conversation #217

Merged · 37 commits · Aug 30, 2024
Commits (all by dillonalaird):

6b8e58b  update for new conv (Aug 15, 2024)
07a9b84  add artifact tools (Aug 15, 2024)
8dede49  update local executor (Aug 15, 2024)
97556be  fix upload/download (Aug 26, 2024)
82169c2  cleaned up code for artifacts (Aug 26, 2024)
d1f1602  starting artifact prompts (Aug 26, 2024)
2fc76a5  app to add files to artifacts (Aug 28, 2024)
11cef6f  add support for artifacts (Aug 28, 2024)
0163daa  add artifact meta tools (Aug 28, 2024)
2596f43  ran isort (Aug 28, 2024)
51d49f5  prompt to work with artifacts (Aug 28, 2024)
e31de9f  minor fixes for prompts (Aug 28, 2024)
9e83881  add docs, fix load and saving remote files (Aug 28, 2024)
84757f7  rename prompts (Aug 28, 2024)
65c8cdb  add docs for artifacts, allow None artifacts (which don't load) to be… (Aug 28, 2024)
b3c13b1  e2b and local uplaod/download work similarly now, can pass in target … (Aug 28, 2024)
6ebb75b  add Artifacts to exports (Aug 28, 2024)
907c449  local chat app to work with artifacts (Aug 28, 2024)
bbae983  updated docs (Aug 28, 2024)
3e7cfd2  fix flake8 (Aug 28, 2024)
afc87c0  fix mypy errors (Aug 28, 2024)
4aa9fec  fix format (Aug 28, 2024)
53dea57  add execution to conversation (Aug 29, 2024)
e508809  fixed type errors (Aug 29, 2024)
d83857e  fixed bug with upload file (Aug 29, 2024)
51503b9  added ability to write media files to artifacts (Aug 29, 2024)
0ed6bb7  return outside of context (Aug 29, 2024)
04bd768  make remote path execute variable (Aug 29, 2024)
9782893  add codec for video encoding (Aug 29, 2024)
75c1289  fix prompts to include writing media artifacts (Aug 29, 2024)
1d8dd78  isort (Aug 29, 2024)
7a510e3  fix typo (Aug 29, 2024)
ac9a5e0  added redisplay for nested notebook sessions (Aug 29, 2024)
32b1ce9  return artifacts (Aug 30, 2024)
33cf8e7  add trace for last edited artifact (Aug 30, 2024)
40c1cbd  handle artifact return (Aug 30, 2024)
58a1be4  only add text to obs, no trace (Aug 30, 2024)
README.md (11 additions, 7 deletions)
@@ -41,15 +41,15 @@ export OPENAI_API_KEY="your-api-key"
 ```
 
 ### Vision Agent
-There are two agents that you can use. Vision Agent is a conversational agent that has
+There are two agents that you can use. `VisionAgent` is a conversational agent that has
 access to tools that allow it to write an navigate python code and file systems. It can
-converse with the user in natural language. VisionAgentCoder is an agent that can write
-code for vision tasks, such as counting people in an image. However, it cannot converse
-and can only respond with code. VisionAgent can call VisionAgentCoder to write vision
-code.
+converse with the user in natural language. `VisionAgentCoder` is an agent specifically
+for writing code for vision tasks, such as counting people in an image. However, it
+cannot chat with you and can only respond with code. `VisionAgent` can call
+`VisionAgentCoder` to write vision code.
 
 #### Basic Usage
-To run the streamlit app locally to chat with Vision Agent, you can run the following
+To run the streamlit app locally to chat with `VisionAgent`, you can run the following
 command:
 
 ```bash
@@ -146,7 +146,7 @@ the code and having it update. You just need to add the code as a response from
 assistant:
 
 ```python
-agent = va.agent.VisionAgent(verbosity=2)
+agent = va.agent.VisionAgentCoder(verbosity=2)
 conv = [
     {
         "role": "user",
@@ -212,6 +212,10 @@ function. Make sure the documentation is in the same format above with descripti
 `Parameters:`, `Returns:`, and `Example\n-------`. You can find an example use case
 [here](examples/custom_tools/) as this is what the agent uses to pick and use the tool.
 
+Can't find the tool you need and want add it to `VisionAgent`? Check out our
+[vision-agent-tools](https://github.com/landing-ai/vision-agent-tools) repository where
+we add the source code for all the tools used in `VisionAgent`.
+
 ## Additional Backends
 ### Ollama
 We also provide a `VisionAgentCoder` that uses Ollama. To get started you must download
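The README example in the second hunk passes previously generated code back to the agent as an assistant turn. As a minimal sketch of that message-list shape in plain Python (the `make_conv` helper and the content strings are hypothetical illustrations, not code from this PR):

```python
from typing import Dict, List

# Simplified stand-in for the library's Message type: a role/content dict.
Message = Dict[str, str]

def make_conv() -> List[Message]:
    # Conversation shape used in the README's VisionAgentCoder example:
    # prior code is fed back as an "assistant" message so the agent can
    # update it in response to the follow-up user request.
    return [
        {"role": "user", "content": "Can you write code to count apples in this image?"},
        {"role": "assistant", "content": "<previously generated code>"},
        {"role": "user", "content": "Only count the red apples."},
    ]
```

The key point the README makes is simply that the "assistant" entry carries the earlier code verbatim; the agent then revises it rather than starting over.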
docs/index.md (13 additions, 8 deletions)
@@ -38,15 +38,15 @@ export OPENAI_API_KEY="your-api-key"
 ```
 
 ### Vision Agent
-There are two agents that you can use. Vision Agent is a conversational agent that has
+There are two agents that you can use. `VisionAgent` is a conversational agent that has
 access to tools that allow it to write an navigate python code and file systems. It can
-converse with the user in natural language. VisionAgentCoder is an agent that can write
-code for vision tasks, such as counting people in an image. However, it cannot converse
-and can only respond with code. VisionAgent can call VisionAgentCoder to write vision
-code.
+converse with the user in natural language. `VisionAgentCoder` is an agent specifically
+for writing code for vision tasks, such as counting people in an image. However, it
+cannot chat with you and can only respond with code. `VisionAgent` can call
+`VisionAgentCoder` to write vision code.
 
 #### Basic Usage
-To run the streamlit app locally to chat with Vision Agent, you can run the following
+To run the streamlit app locally to chat with `VisionAgent`, you can run the following
 command:
 
 ```bash
@@ -143,7 +143,7 @@ the code and having it update. You just need to add the code as a response from
 assistant:
 
 ```python
-agent = va.agent.VisionAgent(verbosity=2)
+agent = va.agent.VisionAgentCoder(verbosity=2)
 conv = [
     {
         "role": "user",
@@ -209,6 +209,10 @@ function. Make sure the documentation is in the same format above with descripti
 `Parameters:`, `Returns:`, and `Example\n-------`. You can find an example use case
 [here](examples/custom_tools/) as this is what the agent uses to pick and use the tool.
 
+Can't find the tool you need and want add it to `VisionAgent`? Check out our
+[vision-agent-tools](https://github.com/landing-ai/vision-agent-tools) repository where
+we add the source code for all the tools used in `VisionAgent`.
+
 ## Additional Backends
 ### Ollama
 We also provide a `VisionAgentCoder` that uses Ollama. To get started you must download
@@ -230,6 +234,7 @@ tools. You can use it just like you would use `VisionAgentCoder`:
 >>> agent = va.agent.OllamaVisionAgentCoder()
 >>> agent("Count the apples in the image", media="apples.jpg")
 ```
+> WARNING: VisionAgent doesn't work well unless the underlying LMM is sufficiently powerful. Do not expect good results or even working code with smaller models like Llama 3.1 8B.
 
 ### Azure OpenAI
 We also provide a `AzureVisionAgentCoder` that uses Azure OpenAI models. To get started
@@ -241,7 +246,7 @@ follow the Azure Setup section below. You can use it just like you would use
 >>> agent = va.agent.AzureVisionAgentCoder()
 >>> agent("Count the apples in the image", media="apples.jpg")
 ```
-> WARNING: VisionAgent doesn't work well unless the underlying LMM is sufficiently powerful. Do not expect good results or even working code with smaller models like Llama 3.1 8B.
+
 
 ### Azure Setup
 If you want to use Azure OpenAI models, you need to have two OpenAI model deployments:
examples/chat/app.py (14 additions, 2 deletions)
@@ -26,7 +26,14 @@
         "response": "saved",
         "style": {"bottom": "calc(50% - 4.25rem", "right": "0.4rem"},
     }
-agent = va.agent.VisionAgent(verbosity=1)
+# set artifacts remote_path to WORKSPACE
+artifacts = va.tools.Artifacts(WORKSPACE / "artifacts.pkl")
+if Path("artifacts.pkl").exists():
+    artifacts.load("artifacts.pkl")
+else:
+    artifacts.save("artifacts.pkl")
+
+agent = va.agent.VisionAgent(verbosity=1, local_artifacts_path="artifacts.pkl")
 
 st.set_page_config(layout="wide")
 
@@ -44,7 +51,9 @@
 
 
 def update_messages(messages, lock):
-    new_chat = agent.chat_with_code(messages)
+    if Path("artifacts.pkl").exists():
+        artifacts.load("artifacts.pkl")
+    new_chat, _ = agent.chat_with_code(messages, artifacts=artifacts)
     with lock:
         for new_message in new_chat:
             if new_message not in messages:
@@ -122,6 +131,9 @@ def main():
         with open(WORKSPACE / uploaded_file.name, "wb") as f:
             f.write(uploaded_file.getbuffer())
 
+        # make it None so it wont load and overwrite the image
+        artifacts.artifacts[uploaded_file.name] = None
+
     for file in WORKSPACE.iterdir():
         if "__pycache__" not in str(file) and not str(file).startswith("."):
             if st.button(file.name):
vision_agent/agent/agent.py (1 addition, 1 deletion)
@@ -11,7 +11,7 @@ def __call__(
         self,
         input: Union[str, List[Message]],
         media: Optional[Union[str, Path]] = None,
-    ) -> str:
+    ) -> Union[str, List[Message]]:
         pass
 
     @abstractmethod
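This hunk widens the abstract `__call__` return type from `str` to `Union[str, List[Message]]`, so callers can no longer assume a plain string. A hedged sketch of the branching a caller now needs (`Message` is simplified here and `render_result` is a hypothetical helper, not part of the PR):

```python
from typing import Dict, List, Union

# Simplified stand-in for vision_agent's Message type.
Message = Dict[str, str]

def render_result(result: Union[str, List[Message]]) -> str:
    # An agent call may now return either a finished string or the full
    # conversation as a list of Message dicts; branch on the runtime type.
    if isinstance(result, str):
        return result
    return "\n".join(m.get("content", "") for m in result)
```

Narrowing with `isinstance` also keeps type checkers like mypy happy with the widened union, which is presumably why the PR includes separate "fixed type errors" commits.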