Minor changes to support benchmarking #138

Merged
merged 5 commits into from
Jun 18, 2024
3 changes: 2 additions & 1 deletion examples/mask_app/app.py
@@ -1,8 +1,9 @@
 import cv2
-import streamlit as st
 from PIL import Image
 from streamlit_drawable_canvas import st_canvas

+import streamlit as st
+
 st.title("Image Segmentation Mask App")

 uploaded_file = st.file_uploader("Choose an image...", type=["jpg", "png"])
6 changes: 3 additions & 3 deletions poetry.lock

Some generated files are not rendered by default.

2 changes: 2 additions & 0 deletions vision_agent/agent/vision_agent_prompts.py
@@ -179,6 +179,8 @@ def find_text(image_path: str, text: str) -> str:
 8. DO NOT use try except block to handle the error, let the error be raised if the code is incorrect.
 9. DO NOT import the testing function as it will available in the testing environment.
 10. Print the output of the function that is being tested.
+11. Use the output of the function that is being tested as the return value of the testing function.
+12. Run the testing function in the end and don't assign a variable to its output.
Member

I'm trying to think of ways to keep the list of instructions shorter, as I suspect that as the list gets longer, the ability to follow the directions gets poorer.

Would this also work if we removed instruction 10 (as it seems to accomplish the same thing as 11) and combined 11 and 12 into "Return the output of the function that is being tested in the test script, do not assign it to another variable" or something shorter?

Collaborator Author

@humpydonkey Jun 18, 2024

I tried a few approaches, and this is actually the best prompt so far.
The other approaches significantly increase the chance of the testing function returning null.
Here are the approaches I have tried:

  1. Removing 10
  2. Merging 11 + 12 into one instruction
  3. Rewriting 11 in a more concise way
  4. Moving 11 + 12 up to be the 3rd and 4th instructions
  5. Moving all instructions up

And some combinations of the above.

From my observation, the length of the list is not a problem, but shortening it can actually cause a big one: the rate of null values increases from 11% to 80% in my benchmark.
We probably want to rewrite/revisit this entire prompt at some point. It's a bit brittle now.
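
To make instructions 10 to 12 concrete, here is a minimal sketch of a test script that follows them. The body of `find_text` is a hypothetical stub added only so the snippet runs on its own (per instruction 9, a real test script would not define or import it), and its return format is an assumption. The last line is a bare call, presumably so that a notebook-style executor can pick up the returned value as the output of the cell.

```python
def find_text(image_path: str, text: str) -> str:
    # Hypothetical stub standing in for the function under test.
    # In the real testing environment this function is already available.
    return f"'{text}' found in {image_path}"


def test_find_text():
    result = find_text("image.jpg", "hello")
    print(result)   # instruction 10: print the output
    return result   # instruction 11: return the output of the tested function


# Instruction 12: run the testing function last, without assigning its output.
test_find_text()
```

Writing `output = test_find_text()` as the last line instead would leave no trailing expression for the executor to report, which may be one way a null result arises.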

"""


19 changes: 17 additions & 2 deletions vision_agent/utils/execute.py
@@ -309,6 +309,21 @@ def success(self) -> bool:
         """
         return self.error is None

+    def get_main_result(self) -> Optional[Result]:
+        """
+        Get the main result of the execution.
+        An execution may have multiple results, e.g. intermediate outputs. The main result is the last output of the cell execution.
+        """
+        if not self.success:
+            _LOGGER.info("Result is not available as the execution was not successful.")
+            return None
+        if not self.results or not any(res.is_main_result for res in self.results):
+            _LOGGER.info("Execution was successful but there is no main result.")
+            return None
+        main_result = self.results[-1]
+        assert main_result.is_main_result, "The last result should be the main result."
+        return main_result
+
     def to_json(self) -> str:
         """
         Returns the JSON representation of the Execution object.
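
The behaviour of `get_main_result` can be tried out in isolation with the sketch below. The `Result` and `Execution` classes here are hypothetical, stripped-down stand-ins (only `results`, `error`, `success`, and `is_main_result` are taken from the diff; the `value` field and the dataclass layout are assumptions), and logging is omitted.

```python
from dataclasses import dataclass, field
from typing import Any, List, Optional


@dataclass
class Result:
    """Hypothetical stand-in for one output of a cell execution."""
    value: Any
    is_main_result: bool = False


@dataclass
class Execution:
    """Hypothetical stand-in holding all outputs of one cell execution."""
    results: List[Result] = field(default_factory=list)
    error: Optional[str] = None

    @property
    def success(self) -> bool:
        return self.error is None

    def get_main_result(self) -> Optional[Result]:
        # A failed execution has no usable result.
        if not self.success:
            return None
        # Successful, but nothing was flagged as the main result.
        if not self.results or not any(r.is_main_result for r in self.results):
            return None
        # The main result is expected to be the last output.
        main_result = self.results[-1]
        assert main_result.is_main_result, "The last result should be the main result."
        return main_result


ok = Execution(results=[Result("intermediate plot"), Result(42, is_main_result=True)])
failed = Execution(results=[Result(42, is_main_result=True)], error="NameError")
printed_only = Execution(results=[Result("intermediate plot")])

print(ok.get_main_result().value)
print(failed.get_main_result())
print(printed_only.get_main_result())
```

The three cases cover a successful run with a main result, a failed run, and a successful run that produced only intermediate outputs; the last two both yield `None`, which is the "null" outcome the benchmark discussion refers to.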
@@ -411,11 +426,11 @@ def __init__(self, *args: Any, **kwargs: Any) -> None:
             """
 import platform
 import sys
-import pkg_resources
+import importlib.metadata

 print(f"Python version: {sys.version}")
 print(f"OS version: {platform.system()} {platform.release()} ({platform.architecture()})")
-va_version = pkg_resources.get_distribution("vision-agent").version
+va_version = importlib.metadata.version("vision-agent")
 print(f"Vision Agent version: {va_version}")"""
         )
         sys_versions = "\n".join(result.logs.stdout)
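
For context on the change above: `importlib.metadata` has been part of the standard library since Python 3.8, while `pkg_resources` ships with setuptools and has been deprecated there. The sketch below shows the lookup with an explicit fallback when the distribution is not installed. The helper name `installed_version` is hypothetical and not part of this PR, which calls `importlib.metadata.version` directly.

```python
import importlib.metadata
from typing import Optional


def installed_version(dist_name: str) -> Optional[str]:
    """Return the installed version of a distribution, or None if it is absent.

    Hypothetical helper for illustration only.
    """
    try:
        return importlib.metadata.version(dist_name)
    except importlib.metadata.PackageNotFoundError:
        return None


# Prints the version string if vision-agent is installed, otherwise None.
print(f"Vision Agent version: {installed_version('vision-agent')}")
```

Inside the version-printing cell the unguarded call is reasonable, since `vision-agent` is expected to be installed in that environment.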