Add Long Term Memory and Feedback #80

Merged
merged 10 commits into from
May 13, 2024

Conversation

dillonalaird
Member

Adds the remaining two items for the programming Vision Agent: long term memory and feedback. I also tried to make the vision agent more stateless. Calling chat_with_workflow now returns more of its results (the code, the plan, and the working memory) so that the agent doesn't have to hold on to them as state:

working_memory holds the trial-and-error reflections the model creates when debugging failed code. You can retrieve it and use it as long term memory in future runs:

import vision_agent as va

output = agent.chat_with_workflow([{"role": "user", "content": "..."}])
wm = output["working_memory"]

# can save and load it
wm.save("working_mem")
wm = va.utils.load_sim("working_mem")

# merge with existing long term memory (ltm is a previously loaded memory)
new_ltm = va.utils.merge_sim(wm, ltm)

# can use working memory as long term memory
agent = va.agent.VisionAgentV2(long_term_memory=new_ltm)

If a subtask in the plan fails, the agent returns early with the partially completed code and plan. You can pass the partially completed plan and conversation back to the agent to finish:

output = agent.chat_with_workflow([{"role": "user", "content": "can you code this?"}])

output = agent.chat_with_workflow(
    [{
        "role": "user",
        "content": "can you code this?"
    }, {
        "role": "assistant",
        "content": output["code"],
    }, {
        "role": "user",
        "content": "No, can you use this library?"
    }],
    plan=output["plan"],
)

The same pattern works if you just want to converse with the agent. Passing the old plan back is optional and probably only useful if some part of the original plan failed. This way the chat itself stays stateless, and the caller tracks the conversation and plan.
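The follow-up call can be wrapped in a small helper so that the caller owns the conversation and the plan. This is a minimal sketch, not code from this PR: `build_followup` is a hypothetical name, and it assumes only what the example above shows, namely that the returned dict has `code` and `plan` keys. The plan value used here is a placeholder, since the real plan format is not shown in this thread.

```python
from typing import Any, Dict, List, Optional, Tuple

Message = Dict[str, str]


def build_followup(
    chat: List[Message],
    output: Dict[str, Any],
    feedback: str,
    reuse_plan: bool = True,
) -> Tuple[List[Message], Optional[Any]]:
    """Return the next chat and plan to send, without mutating the inputs."""
    next_chat = list(chat) + [
        # The agent's previous answer becomes the assistant turn.
        {"role": "assistant", "content": output["code"]},
        # The user's feedback becomes the next user turn.
        {"role": "user", "content": feedback},
    ]
    # Passing the old plan back is optional.
    plan = output.get("plan") if reuse_plan else None
    return next_chat, plan


# Stand-in for the dict returned by agent.chat_with_workflow (assumed shape,
# placeholder plan).
chat = [{"role": "user", "content": "can you code this?"}]
fake_output = {"code": "print('hello')", "plan": ["placeholder step"]}

next_chat, plan = build_followup(chat, fake_output, "No, can you use this library?")
# With a real agent: output = agent.chat_with_workflow(next_chat, plan=plan)
```

Because the helper returns a new list instead of appending in place, the caller can keep every intermediate conversation around and retry from any point.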

Collaborator

@shankar-vision-eng shankar-vision-eng left a comment

Had some questions

vision_agent/agent/vision_agent_v2.py
data["desc"].append(key)
data["doc"].append("\n".join(value))
df = pd.DataFrame(data) # type: ignore
return Sim(df, sim_key="desc")
Collaborator

Can you describe what happens in this function? Sim returns a df with embeddings calculated on the given column. From what I see, we build the df from a working memory that contains desc and doc. Are desc and doc the description and docstring of tools, or are they something else? I was under the assumption that working memory is all the artifacts from a previous run.

Member Author

Oh yeah, for memory, desc and doc are probably not the best terminology. In this case, desc is the subtask string and doc is the debug attempts made while trying to build code for that subtask. For example, you might have:

desc: "load the file 'dog.png'"
doc: """
image = open('cat.png')

reflection: You opened the wrong image name, it should be 'dog.png'

image = open('dog.png')
"""

So this context could be saved as long term memory. Then, the next time the agent encounters the subtask "load the file 'dog.png'", it could retrieve this context to help it.
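To make the retrieval step concrete, here is a toy sketch of looking up a saved reflection by subtask description. It is not the library's Sim class: per the discussion above, Sim computes embeddings on the desc column, while this stand-in uses word-overlap (Jaccard) similarity so that it runs without any model. The names `to_columns` and `retrieve` and the 0.5 threshold are hypothetical.

```python
import re
from typing import Dict, List, Optional, Set


def to_columns(working_memory: Dict[str, List[str]]) -> Dict[str, List[str]]:
    """Flatten {subtask: [attempts, ...]} into parallel desc/doc columns."""
    data: Dict[str, List[str]] = {"desc": [], "doc": []}
    for key, value in working_memory.items():
        data["desc"].append(key)
        data["doc"].append("\n".join(value))
    return data


def _tokens(text: str) -> Set[str]:
    """Lowercased alphanumeric words of a string."""
    return set(re.findall(r"[a-z0-9]+", text.lower()))


def retrieve(
    data: Dict[str, List[str]], query: str, threshold: float = 0.5
) -> Optional[str]:
    """Return the doc whose desc best overlaps the query, or None."""
    best_doc: Optional[str] = None
    best_score = 0.0
    query_tokens = _tokens(query)
    for desc, doc in zip(data["desc"], data["doc"]):
        desc_tokens = _tokens(desc)
        union = query_tokens | desc_tokens
        # Jaccard similarity: shared words divided by all distinct words.
        score = len(query_tokens & desc_tokens) / len(union) if union else 0.0
        if score > best_score:
            best_doc, best_score = doc, score
    return best_doc if best_score >= threshold else None


working_memory = {
    "load the file 'dog.png'": [
        "image = open('cat.png')",
        "reflection: You opened the wrong image name, it should be 'dog.png'",
        "image = open('dog.png')",
    ],
}
memory = to_columns(working_memory)
hit = retrieve(memory, "load the file 'dog.png'")
miss = retrieve(memory, "count the cars in the video")
```

Here `hit` is the joined debug history for the matching subtask, and `miss` is None because the unrelated query falls below the threshold. An embedding-based lookup would also match paraphrased subtasks, which word overlap cannot do.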

Collaborator

@shankar-vision-eng shankar-vision-eng left a comment

LGTM

@dillonalaird dillonalaird merged commit d6fd63e into main May 13, 2024
7 checks passed
@dillonalaird dillonalaird deleted the add-mem-feedback branch May 14, 2024 21:33