
Commit eda5d4c

fixed incorrect json formatting in prompt
dillonalaird committed Oct 2, 2024
1 parent b9cce16 commit eda5d4c
Showing 1 changed file with 5 additions and 5 deletions.
10 changes: 5 additions & 5 deletions vision_agent/agent/vision_agent_prompts.py
@@ -26,8 +26,8 @@
**Instructions**:
1. **Understand and Clarify**: Make sure you understand the task, ask clarifying questions if the task is not clear.
-2. **Code Generation**: Only use code provided in the Documentation in your <execute_python> tags. For generating vision code always start by calling `generate_vision_plan`, once you've agreed upon a plan with the user then call `generate_vision_code`. Only use `edit_vision_code` to modify code written by `generate_vision_code`.
-3. **Execute**: Do only what the user asked you to do and no more. If you need to ask the user a question, set `let_user_respond` to `true`.
+2. **Code Generation**: Only use code provided in the Documentation in your <execute_python> tags. For generating vision code always start by calling `generate_vision_plan` and asking the user which plan they want to use, after that call `generate_vision_code` with the chosen plan. Only use `edit_vision_code` to modify code written by `generate_vision_code`.
+3. **Execute**: Do only what the user asked you to do and no more, do not debug vision code for them. If you need to ask the user a question, set `let_user_respond` to `true`.
4. **Output in JSON**: Respond in the following format in JSON:
```json
@@ -67,7 +67,7 @@
USER: Plan 1 sounds good to me with your modifications.
AGENT: {"thoughts": "I will use the generate_vision_code to detect the dogs in the image.", "response": "<execute_python>generate_vision_code(artifacts, 'dog_detector.py', 'Can you write code to detect dogs in this image?', media=['/path/to/images/dog.jpg'], plan={'thoughts': 'The most straightforward and reliable approach is to use the owl_v2_image tool, which is specifically designed for object detection. It can detect multiple objects given a text prompt, which is perfect for identifying dogs in an image.', 'instructions': ['Load the image 'dogs.jpg' using the load_image tool', 'Use the owl_v2_image tool with the prompt 'dog' to detect dogs in the loaded image', 'Overlay the detected bounding boxes on the original image using the overlay_bounding_boxes tool', 'Save the resulting image with bounding boxes using the save_image tool']}, 'Add a confidence threshold of 0.5 to filter out the low-confidence detections.', 'dog_detector_plan.json')</execute_python>", "let_user_respond": false}
AGENT: {"thoughts": "I will use the generate_vision_code to detect the dogs in the image.", "response": "<execute_python>generate_vision_code(artifacts, 'dog_detector.py', 'Can you write code to detect dogs in this image?', media=['/path/to/images/dog.jpg'], plan={'thoughts': 'The most straightforward and reliable approach is to use the owl_v2_image tool, which is specifically designed for object detection. It can detect multiple objects given a text prompt, which is perfect for identifying dogs in an image.', 'instructions': ['Load the image \'dogs.jpg\' using the load_image tool', 'Use the owl_v2_image tool with the prompt \'dog\' to detect dogs in the loaded image', 'Overlay the detected bounding boxes on the original image using the overlay_bounding_boxes tool', 'Save the resulting image with bounding boxes using the save_image tool']}, 'Add a confidence threshold of 0.5 to filter out the low-confidence detections.', 'dog_detector_plan.json')</execute_python>", "let_user_respond": false}
OBSERVATION:
[Artifact dog_detector.py]
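
The fix in this hunk escapes the single quotes nested inside the plan's single-quoted instruction strings. A minimal sketch of why that matters (the `instruction` variable below is illustrative, not from the repository): an unescaped inner quote ends the Python string literal early, so the example `generate_vision_code` call the model is meant to imitate would not be valid Python.

```python
# Illustrative only: why the nested quotes in the prompt must be escaped.

# Broken -- the bare quote before dogs.jpg ends the literal early,
# so this line is a SyntaxError if uncommented:
#   instruction = 'Load the image 'dogs.jpg' using the load_image tool'

# Fixed, as in the commit: escape the nested quotes.
instruction = 'Load the image \'dogs.jpg\' using the load_image tool'

# Equivalent alternative: use double quotes on the outside instead.
instruction_alt = "Load the image 'dogs.jpg' using the load_image tool"

assert instruction == instruction_alt
```
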
@@ -176,7 +176,7 @@
USER: Could you use plan 1 with owl_v2 instead and a confidence threshold of 0.3?
AGENT: {"thoughts": "I will incorporate the users thoughts when generating the vision code.", "response": "<execute_python>generate_vision_code(artifacts, 'code.py', 'Can you create a function to count workers with helmets, return the count and save a visaulization of the bounding boxes?', media=['/path/to/images/workers.png'], plan={'thoughts': 'This plan uses owl_v2_image, which is specifically designed for detecting and counting multiple objects given a text prompt. It's likely the most accurate and efficient method for this task.', 'instructions': ['Load the image 'workers.png' using the load_image function', 'Use owl_v2_image with the prompt 'workers with helmets' to detect and count workers wearing helmets', 'Count the number of detections returned by owl_v2_image', 'Use overlay_bounding_boxes to visualize the detections on the original image', 'Save the visualization using save_image', 'Return the count of workers with helmets'], 'Add a confidence threshold of 0.3 to filter out low-confidence detections.', 'workers_plan_context.json'})</execute_python>", "let_user_respond": false}
AGENT: {"thoughts": "I will incorporate the users thoughts when generating the vision code.", "response": "<execute_python>generate_vision_code(artifacts, 'code.py', 'Can you create a function to count workers with helmets, return the count and save a visaulization of the bounding boxes?', media=['/path/to/images/workers.png'], plan={'thoughts': 'This plan uses owl_v2_image, which is specifically designed for detecting and counting multiple objects given a text prompt. It's likely the most accurate and efficient method for this task.', 'instructions': ['Load the image \'workers.png\' using the load_image function', 'Use owl_v2_image with the prompt \'workers with helmets\' to detect and count workers wearing helmets', 'Count the number of detections returned by owl_v2_image', 'Use overlay_bounding_boxes to visualize the detections on the original image', 'Save the visualization using save_image', 'Return the count of workers with helmets'], 'Add a confidence threshold of 0.3 to filter out low-confidence detections.', 'workers_plan_context.json'})</execute_python>", "let_user_respond": false}
OBSERVATION:
[Artifact code.py]
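
This hunk applies the same escaping pattern to the workers example. As a sketch of an alternative that sidesteps hand-escaping entirely (an assumption, not how this prompt file is built), the embedded plan could be rendered with `repr()`, which chooses quote characters and escapes nested quotes automatically:

```python
# Sketch of an alternative (not the repo's approach): render the embedded
# plan with repr() so quoting is handled mechanically.
plan = {
    "thoughts": "This plan uses owl_v2_image to detect and count workers.",
    "instructions": [
        "Load the image 'workers.png' using the load_image function",
        "Use owl_v2_image with the prompt 'workers with helmets'",
    ],
}

# repr() picks quote characters and escapes nested quotes as needed,
# so the rendered call is always syntactically valid Python.
call = f"generate_vision_code(artifacts, 'code.py', plan={plan!r})"
print(call)
```
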
@@ -215,7 +215,7 @@
OBSERVATION:
[Fine tuning id: 23b3b022-5ebf-4798-9373-20ef36429abf]
AGENT: {"thoughts": "The model has finished fine tuning, I will now replace the original owl_v2_image call with the fine tuning id.", "response": "I will not update the code to use the fine tuned model. <execute_python>use_object_detection_fine_tuning(artifacts, "code.py", "23b3b022-5ebf-4798-9373-20ef36429abf")</execute_python>", "let_user_respond": false}
AGENT: {"thoughts": "The model has finished fine tuning, I will now replace the original owl_v2_image call with the fine tuning id.", "response": "I will not update the code to use the fine tuned model. <execute_python>use_object_detection_fine_tuning(artifacts, 'code.py', '23b3b022-5ebf-4798-9373-20ef36429abf')</execute_python>", "let_user_respond": false}
OBSERVATION:
[Artifact code.py edits]
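
This last hunk fixes a different flavor of the same bug: the AGENT line is itself a JSON object, so the unescaped double quotes around "code.py" inside the double-quoted "response" value would terminate that string early. A small self-contained check (the literals below are condensed from the diff, not copied verbatim):

```python
import json

# Unescaped double quotes inside the "response" value break the JSON.
broken = '{"response": "<execute_python>use_object_detection_fine_tuning(artifacts, "code.py")</execute_python>"}'
try:
    json.loads(broken)
except json.JSONDecodeError as err:
    print("broken:", err)

# Single quotes in the embedded Python call (as in the fix) keep it valid.
fixed = '{"response": "<execute_python>use_object_detection_fine_tuning(artifacts, \'code.py\')</execute_python>"}'
print("fixed:", json.loads(fixed)["response"])
```
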
