About vision agent always running #167
Thanks for testing this out! I see you're also using a custom tool. Is the issue here that it keeps re-trying but never exits? We have a couple areas where it retries right now:
If you could share your prompt and custom tools, I can run it on my end and try to reproduce. If you are uncomfortable sharing here, you can also reach out to me on Discord and share privately: https://discord.gg/RVcW3j9RgR
Yes, I think the problem is the same as what you said. Of course — here are my prompt and custom tools:

```python
def Dalle2_text2img(
    prompt: str,
) -> Dict[str, Any]:
    """'Dalle2_text2img' is a tool for generating images from text prompts using a
    Dalle2 model. This function allows the user to specify prompts, providing finer
    control over the generation process.

    Parameters:
        prompt (str): The text prompt to generate the image from. This is the main
            description of the desired image.

    Returns:
        Dict[str, Any]: A dictionary containing the generated image url.

    Example
    -------
    >>> Dalle2_text2img(prompt="a photo of young girl")
    {'imgfile': 'https://oaidalleapiprodscus.blob.core.windows.net/private/org-TeSt22FBWEC7NQLwRraLiXm8/user-3LIIFxvhKPI8jsc4MSblg3KD/img-5vStBsnLxTSiOcEr35JkePBb.png?st=2024-07-03T08%3A34%3A29Z&se=2024-07-03T10%3A34%3A29Z&sp=r&sv=2023-11-03&sr=b&rscd=inline&rsct=image/png&skoid=6aaadede-4fb3-4698-a8f6-684d7786b067&sktid=a48cca56-e6da-484e-a814-9c849652bcb3&skt=2024-07-03T02%3A51%3A34Z&ske=2024-07-04T02%3A51%3A34Z&sks=b&skv=2023-11-03&sig=DMyFFUQEASKKPDaTKEOqoW5M8ygx6OOhiThmjW/Fz84%3D'}
    """
    url = "https://api.openai.com/v1/images/generations"
    headers = {
        "Authorization": _OPENAI_API_KEY,
        "Content-Type": "application/json"
    }
    data = {
        "prompt": prompt,
        "n": 1,  # Number of images to generate
        "size": "256x256"  # Size of the generated image
    }
    response = requests.post(url, headers=headers, data=json.dumps(data))
    if response.status_code == 200:
        image_url = response.json()["data"][0]["url"]
        print(f"Generated image URL: {image_url}")
    else:
        print(f"Failed to generate image. Status code: {response.status_code}")
        print(response.json())
    outdata = {
        "imgfile": image_url
    }
    return outdata


def Dalle2_prompt_gen(
    text: str
) -> Dict[str, Any]:
    """'Dalle2_prompt_gen' is a tool for writing prompt for Dalle model to generate
    diffusion image. This function imagine relevant scenes or objects and returns a
    list of words which are visually specific concepts and vibes.

    Parameters:
        text (str): The input text

    Returns:
        List[Dict[str, Any]]: A list of dictionaries containing the generated prompt
            for diffusion model Dalle.

    Example
    -------
    >>> Dalle2_prompt_gen(text="beautiful young woman")
    [
        {'prompt': 'a photo of bella'},
        {'prompt': 'a photo of young woman with yellow hat'},
    ]
    """
    client = OpenAI(
        # This is the default and can be omitted
        api_key=os.environ.get("OPENAI_API_KEY"),
    )
    chat_completion = client.chat.completions.create(
        messages=[
            {"role": "system", "content": "You are a Diffusion model prompt generator. You onlu output English. You should imagine relevant scenes or objects and returns a list of words which are visually specific concepts and vibes. For example, if the input prompt is Plants, output a python list of length 2 as follows: ['a photo of a tree', 'a photo of grass'] "},
            {"role": "user", "content": text}
        ],
        model="gpt-3.5-turbo",
    )
    return_data = []
    promptlist = ast.literal_eval(chat_completion.choices[0].message.content)
    for i in range(len(promptlist)):
        return_data.append(
            {
                "prompt": promptlist[i]
            }
        )
    return return_data


def Dalle2_imgvaria(
    imgurl: str,
) -> Dict[str, Any]:
    """'Dalle2_imgvaria' is a tool for generating variated images from an input image
    using a Dalle2 model. This function allows the user to reimagin more different but
    senmantic similar images.

    Parameters:
        imgurl (str): The input image from. This is the main reference of the desired image.

    Returns:
        Dict[str, Any]: A dictionary containing the generated image filename.

    Example
    -------
    >>> Dalle2_imgvaria(imgurl="https://xxxx/e02210d7-5ce3-4230-b4a3-918066d1c6fc_20231028005328.jpg")
    {'imgfile': './tmp.jpeg'}
    """
    url = "https://api.openai.com/v1/images/generations"
    headers = {
        "Authorization": _OPENAI_API_KEY,
        "Content-Type": "application/json"
    }
    with open(image_path, "rb") as image_file:
        image_data = image_file.read()
    # If necessary, resize and convert the image
    image = Image.open(io.BytesIO(image_data))
    image = image.resize((256, 256))  # Resize to 1024x1024 if required
    buffered = io.BytesIO()
    image.save(buffered, format="PNG")
    image_data = buffered.getvalue()
    multipart_data = {
        "image": ("image.png", image_data, "image/png")
    }
    response = requests.post(url, headers=headers, files=multipart_data)
    if response.status_code == 200:
        image_url = response.json()["data"][0]["url"]
        print(f"Generated image variation URL: {image_url}")
    else:
        print(f"Failed to generate image variation. Status code: {response.status_code}")
        print(response.json())
    outdata = {
        "imgfile": image_url
    }
    return outdata
```

The above three custom tools need to be added to the tool list and need to be set. When I run my prompt, I thought it would exit after three unsuccessful attempts to fix the code, but it turns out that after three fixes it starts to regenerate the plan and modify the code again. Maybe you can explain what the default process is, including how many times it chooses a plan and how many times it fixes code.
Got it, thanks for the explanation and context. Here are the retries it currently does:

However, the outer loop was originally there because it would reflect on the failed output and retry again. I will remove it in a PR since I think it's more likely to cause confusion now (the reflection didn't work that well anyway). To get your code running I modified a few things:
On this PR #169 I've removed the outer loop and also increased the number of error lines sent to the debugger so it doesn't get stuck as easily. Here's the modified code I ran:

```python
import ast
import io
import json
import os
from typing import Any, Dict, List

import requests
from openai import OpenAI
from PIL import Image

import vision_agent as va


@va.tools.register_tool(imports=["import os", "import requests", "import json"])
def Dalle2_text2img(
    prompt: str,
) -> Dict[str, Any]:
    """'Dalle2_text2img' is a tool for generating images from text prompts using a
    Dalle2 model. This function allows the user to specify prompts, providing finer
    control over the generation process.

    Parameters:
        prompt (str): The text prompt to generate the image from. This is the main
            description of the desired image.

    Returns:
        Dict[str, Any]: A dictionary containing the generated image url.

    Example
    -------
    >>> Dalle2_text2img(prompt="a photo of young girl")
    {'imgfile': 'https://oaidalleapiprodscus.blob.core.windows.net/private/org-TeSt22FBWEC7NQLwRraLiXm8/user-3LIIFxvhKPI8jsc4MSblg3KD/img-5vStBsnLxTSiOcEr35JkePBb.png?st=2024-07-03T08%3A34%3A29Z&se=2024-07-03T10%3A34%3A29Z&sp=r&sv=2023-11-03&sr=b&rscd=inline&rsct=image/png&skoid=6aaadede-4fb3-4698-a8f6-684d7786b067&sktid=a48cca56-e6da-484e-a814-9c849652bcb3&skt=2024-07-03T02%3A51%3A34Z&ske=2024-07-04T02%3A51%3A34Z&sks=b&skv=2023-11-03&sig=DMyFFUQEASKKPDaTKEOqoW5M8ygx6OOhiThmjW/Fz84%3D'}
    """
    url = "https://api.openai.com/v1/images/generations"
    headers = {
        "Authorization": f"Bearer {os.environ.get('OPENAI_API_KEY')}",
        "Content-Type": "application/json",
    }
    data = {
        "prompt": prompt,
        "n": 1,  # Number of images to generate
        "size": "256x256",  # Size of the generated image
    }
    response = requests.post(url, headers=headers, data=json.dumps(data))
    image_url = None
    if response.status_code == 200:
        image_url = response.json()["data"][0]["url"]
        print(f"Generated image URL: {image_url}")
    else:
        print(f"Failed to generate image. Status code: {response.status_code}")
        print(response.json())
    outdata = {"imgfile": image_url}
    return outdata


@va.tools.register_tool(
    imports=["import os", "from openai import OpenAI", "import ast"]
)
def Dalle2_prompt_gen(text: str) -> List[Dict[str, Any]]:
    """'Dalle2_prompt_gen' is a tool for writing prompt for Dalle model to generate
    diffusion image. This function imagine relevant scenes or objects and returns a list
    of words which are visually specific concepts and vibes.

    Parameters:
        text (str): The input text

    Returns:
        List[Dict[str, Any]]: A list of dictionaries containing the generated prompt
            for diffusion model Dalle.

    Example
    -------
    >>> Dalle2_prompt_gen(text="beautiful young woman")
    [
        {'prompt': 'a photo of bella'},
        {'prompt': 'a photo of young woman with yellow hat'},
    ]
    """
    client = OpenAI(
        # This is the default and can be omitted
        api_key=os.environ.get("OPENAI_API_KEY"),
    )
    chat_completion = client.chat.completions.create(
        messages=[
            {
                "role": "system",
                "content": "You are a Diffusion model prompt generator. You onlu output English. You should imagine relevant scenes or objects and returns a list of words which are visually specific concepts and vibes. For example, if the input prompt is Plants, output a python list of length 2 as follows: ['a photo of a tree', 'a photo of grass'] ",
            },
            {"role": "user", "content": text},
        ],
        model="gpt-3.5-turbo",
    )
    return_data = []
    promptlist = ast.literal_eval(chat_completion.choices[0].message.content)
    for i in range(len(promptlist)):
        return_data.append({"prompt": promptlist[i]})
    return return_data


@va.tools.register_tool(
    imports=["import os", "import io", "from PIL import Image", "import requests", "import json"]
)
def Dalle2_imgvaria(
    imgurl: str,
) -> Dict[str, Any]:
    """'Dalle2_imgvaria' is a tool for generating variated images from an input image
    using a Dalle2 model. This function allows the user to reimagin more different but
    senmantic similar images.

    Parameters:
        imgurl (str): The input image from. This is the main reference of the desired image.

    Returns:
        Dict[str, Any]: A dictionary containing the generated image filename.

    Example
    -------
    >>> Dalle2_imgvaria(imgurl="https://xxxx/e02210d7-5ce3-4230-b4a3-918066d1c6fc_20231028005328.jpg")
    {'imgfile': './tmp.jpeg'}
    """
    url = "https://api.openai.com/v1/images/generations"
    headers = {
        "Authorization": os.environ.get("OPENAI_API_KEY"),
        "Content-Type": "application/json",
    }
    image = Image.open(requests.get(imgurl, stream=True).raw)
    image = image.resize((256, 256))  # Resize to 1024x1024 if required
    buffered = io.BytesIO()
    image.save(buffered, format="PNG")
    image_data = buffered.getvalue()
    multipart_data = {"image": ("image.png", image_data, "image/png")}
    response = requests.post(url, headers=headers, files=multipart_data)
    image_url = None
    if response.status_code == 200:
        image_url = response.json()["data"][0]["url"]
        print(f"Generated image variation URL: {image_url}")
    else:
        print(
            f"Failed to generate image variation. Status code: {response.status_code}"
        )
        print(response.json())
    outdata = {"imgfile": image_url}
    return outdata


if __name__ == "__main__":
    # generate_and_save_images()
    agent = va.agent.VisionAgent(verbosity=2)
    resp = agent.chat_with_workflow(
        [
            {
                "role": "user",
                "content": "Create a python script that generates N pictures of beautiful young women and saves them to the output_images/ folder. To test, start with N=2."
            }
        ]
    )
    with open("code.py", "w") as f:
        f.write(f"{resp['code']}\n{resp['test']}")
```
Here's the final code it generates (note I run this in the above file so I have the custom tools registered):

```python
import numpy as np
import os
import requests
from PIL import Image
from io import BytesIO
from vision_agent.tools import save_image


def generate_images_of_beautiful_young_women(N):
    # Ensure the output directory exists
    output_dir = 'output_images'
    if not os.path.exists(output_dir):
        os.makedirs(output_dir)
    else:
        # Clear the output directory
        for file in os.listdir(output_dir):
            file_path = os.path.join(output_dir, file)
            if os.path.isfile(file_path):
                os.unlink(file_path)

    # Generate prompts
    prompts = Dalle2_prompt_gen("beautiful young woman")

    # Loop through the prompts and generate images
    for i in range(min(N, len(prompts))):
        prompt = prompts[i]['prompt']
        image_info = Dalle2_text2img(prompt)
        image_url = image_info['imgfile']

        # Download the image
        response = requests.get(image_url)
        image = Image.open(BytesIO(response.content))
        image_np = np.array(image)

        # Save the image
        file_path = os.path.join(output_dir, f"image_{i+1}.png")
        save_image(image_np, file_path)
```

If you are still having trouble getting it to work, I find it works better if you have your functions return a numpy array rather than a URL. That way you avoid the extra code of downloading the image from the image URL.
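To illustrate the numpy-array suggestion, here is a hedged sketch of what such a return value could look like. The helper names `image_bytes_to_array` and `fetch_image_as_array` are made up for this example; only requests, Pillow, and numpy are assumed:

```python
import io

import numpy as np
import requests
from PIL import Image


def image_bytes_to_array(data: bytes) -> np.ndarray:
    """Decode raw image bytes into an RGB numpy array."""
    return np.array(Image.open(io.BytesIO(data)).convert("RGB"))


def fetch_image_as_array(image_url: str) -> np.ndarray:
    """Download a generated image and return pixels instead of a URL,
    so downstream code can pass the result straight to save_image."""
    resp = requests.get(image_url, timeout=30)
    resp.raise_for_status()
    return image_bytes_to_array(resp.content)
```

A tool that ends with `return fetch_image_as_array(image_url)` spares the agent from writing (and debugging) the download-and-decode boilerplate itself.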
Thank you, I will try again to resolve this issue. In addition, I would also like to ask whether the final generated code can be automatically saved as a .py file, or whether the whole process can be printed to a log file, because the intermediate output is very long.
That's a good suggestion, we can look into adding those features. Currently, to save the code output, I save the code from the response:

```python
import vision_agent as va

if __name__ == "__main__":
    agent = va.agent.VisionAgent()
    resp = agent.chat_with_workflow(...)
    with open("code.py", "w") as f:
        f.write(f"{resp['code']}\n{resp['test']}")
```

Would you want something like this built in?
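On the logging half of the question: the intermediate output can already be redirected to a file with Python's standard logging module. This is a sketch under one assumption — that the package logs under the `vision_agent` logger namespace, which the `INFO:vision_agent.agent.vision_agent:` lines elsewhere in this issue suggest:

```python
import logging

# Send the agent's verbose progress output to a file instead of the console.
# Assumption: the library logs under the "vision_agent" logger namespace.
handler = logging.FileHandler("agent_run.log", mode="w")
handler.setFormatter(
    logging.Formatter("%(asctime)s %(name)s %(levelname)s %(message)s")
)
logger = logging.getLogger("vision_agent")
logger.addHandler(handler)
logger.setLevel(logging.INFO)
logger.propagate = False  # keep the long traces out of stdout
```

Run this before constructing the agent; the long plan/debug traces then accumulate in `agent_run.log` for later review.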
Yes, if this function can be added, it will facilitate review and execution.
Hello, I modified max_retries. I don't know if my understanding is correct; if it is incorrect, please correct me. For a given user input, the agent first gives three plans, then writes the code and fixes it repeatedly (3 times by default); then three plans are generated again for the same input, and the code is fixed repeatedly again. I changed max_retries to 2, and the process time was greatly shortened: on average, each task ran for 760 seconds.
I also ran into a question while running: write_plans() already returns the plan together with the tools the plan needs, so why is retrieve_tools() needed afterwards?
This is correct. It's this outer loop https://github.com/landing-ai/vision-agent/blob/main/vision_agent/agent/vision_agent.py#L679 multiplied by either the loop for testing plans https://github.com/landing-ai/vision-agent/blob/main/vision_agent/agent/vision_agent.py#L201 or the loop for debugging the code https://github.com/landing-ai/vision-agent/blob/main/vision_agent/agent/vision_agent.py#L354. Both inner loops only run if the code they write fails, though, so they are only for debugging failed code. I have also removed the outer loop in this PR #169.
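For anyone skimming, the loop structure being described can be reduced to a generic sketch. The names and retry budgets here (`MAX_PLAN_ROUNDS`, `MAX_DEBUG_TRIES`, `run_agent`) are made up for illustration, not the actual vision-agent code linked above:

```python
# A generic sketch of the nested retry structure: an outer loop that
# regenerates plans, and an inner loop that debugs failing code.
MAX_PLAN_ROUNDS = 3   # outer loop: re-plan after a failed round
MAX_DEBUG_TRIES = 3   # inner loop: attempts to fix failing code

def run_agent(write_plan, write_code, run_tests):
    """Return working code, or None once every retry budget is exhausted."""
    history = []
    for plan_round in range(MAX_PLAN_ROUNDS):
        plan = write_plan()
        code = write_code(plan)
        for debug_try in range(MAX_DEBUG_TRIES):
            history.append((plan_round, debug_try))
            if run_tests(code):
                return code, history      # success: exit both loops
            code = write_code(plan)       # attempt a fix, then re-test
    return None, history                  # outer loop also exhausted

# With code that never passes, the worst case is 3 x 3 = 9 test runs:
_, history = run_agent(lambda: "plan", lambda p: "code", lambda c: False)
print(len(history))  # prints 9
```

The inner loop is skipped past its first iteration whenever the tests pass, which matches the comment above that the debug loops only run on failure.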
Thank you very much for your prompt answer!
I looked at the code and thought that if it still failed after three attempts, the process would be terminated. However, I ran several samples and none of them seemed to finish on their own: after three debugging attempts, it re-initializes the code and continues execution.
`INFO:vision_agent.agent.vision_agent:Start debugging attempt 3
WARNING:traitlets:Could not destroy zmq context for <jupyter_client.asynchronous.client.AsyncKernelClient object at 0x321d1f580>
Code and test after attempted fix:
============================== Code ==============================
1 from typing import *
2 from pillow_heif import register_heif_opener
3 register_heif_opener()
4 import vision_agent as va
5 from vision_agent.tools import register_tool
6
7 from typing import *
8 from pillow_heif import register_heif_opener
9 register_heif_opener()
10 import vision_agent as va
11 from vision_agent.tools import register_tool
12 # The fixed code is provided above.
13 # The fixed test code is provided above.
============================== Test ==============================
1 The fixed test code is provided above.
INFO:vision_agent.agent.vision_agent:Reflection: The error was due to invalid Python syntax. It seems like the lines causing the error were meant to be comments or placeholders, but they were not marked as comments. The fix was to comment out these lines.
Code execution result after attempted fix: ----- stdout -----
----- stderr -----
----- Error -----
Traceback (most recent call last):
File "/opt/miniconda3/envs/ben/lib/python3.10/site-packages/vision_agent/utils/execute.py", line 573, in exec_cell
self.nb_client.execute_cell(cell, len(self.nb.cells) - 1)
File "/opt/miniconda3/envs/ben/lib/python3.10/site-packages/jupyter_core/utils/__init__.py", line 165, in wrapped
return loop.run_until_complete(inner)
File "/opt/miniconda3/envs/ben/lib/python3.10/asyncio/base_events.py", line 641, in run_until_complete
return future.result()
File "/opt/miniconda3/envs/ben/lib/python3.10/site-packages/nbclient/client.py", line 1062, in async_execute_cell
await self._check_raise_for_error(cell, cell_index, exec_reply)
File "/opt/miniconda3/envs/ben/lib/python3.10/site-packages/nbclient/client.py", line 918, in _check_raise_for_error
raise CellExecutionError.from_cell_and_msg(cell, exec_reply_content)
nbclient.exceptions.CellExecutionError: An error occurred while executing the following cell:
from typing import *
from pillow_heif import register_heif_opener
register_heif_opener()
import vision_agent as va
from vision_agent.tools import register_tool
from typing import *
from pillow_heif import register_heif_opener
register_heif_opener()
import vision_agent as va
from vision_agent.tools import register_tool
The fixed code is provided above.
The fixed test code is provided above.
The fixed test code is provided above.
Cell In[1], line 13
The fixed test code is provided above.
^
SyntaxError: invalid syntax. Perhaps you forgot a comma?
Final code and tests:
============================== Code ==============================
1 from typing import *
2 from pillow_heif import register_heif_opener
3 register_heif_opener()
4 import vision_agent as va
5 from vision_agent.tools import register_tool
6
7 from typing import *
8 from pillow_heif import register_heif_opener
9 register_heif_opener()
10 import vision_agent as va
11 from vision_agent.tools import register_tool
12 # The fixed code is provided above.
13 # The fixed test code is provided above.
============================== Test ==============================
1 The fixed test code is provided above.
INFO:vision_agent.agent.vision_agent:
┍━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━┑
│ instructions │
┝━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━┥
│ Use the Dalle3_text2img tool to generate 20 images with the prompt 'Belle'. │
├─────────────────────────────────────────────────────────────────────────────────┤
│ Use the save_image tool to save each generated image to the specified directory │
│ '/Users/feifan/benchmark/first_test'. │
┕━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━┙
INFO:vision_agent.agent.vision_agent:
┍━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━┑
│ instructions │
┝━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━┥
│ Use the Dalle3_prompt_gen tool to generate prompts related to 'Belle'. │
├─────────────────────────────────────────────────────────────────────────────────┤
│ Use the Dalle3_text2img tool to generate 20 images using the generated prompts. │
├─────────────────────────────────────────────────────────────────────────────────┤
│ Use the save_image tool to save each generated image to the specified directory │
│ '/Users/feifan/benchmark/first_test'. │
┕━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━┙
INFO:vision_agent.agent.vision_agent:
┍━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━┑
│ instructions │
┝━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━┥
│ Use the Dalle3_text2img tool to generate images with the prompt 'Belle'. │
├──────────────────────────────────────────────────────────────────────────────┤
│ Use the vit_image_classification tool to classify each generated image. │
├──────────────────────────────────────────────────────────────────────────────┤
│ If the top label of the classification result is 'Belle', use the save_image │
│ tool to save the image to the specified directory │
│ '/Users/feifan/benchmark/first_test'. Repeat the process until 20 images are │
│ saved. │
┕━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━┙
INFO:vision_agent.agent.vision_agent:Tools Description:
'vit_image_classification' is a tool that can classify an image. It returns a list of classes and their probability scores based on image content.
'Dalle3_text2img' is a tool for generating images from text prompts using a Dalle3 model. This function allows the user to specify prompts, providing finer control over the generation process.
'clip' is a tool that can classify an image or a cropped detection given a list of input classes or tags. It returns the same list of the input classes along with their probability scores based on image content.
'save_video' is a utility function that saves a list of frames as a mp4 video file on disk.
'Dalle3_prompt_gen' is a tool for writing prompt for Dalle model to generate diffusion image. This function imagine relevant scenes or objects and returns a list of words which are visually specific concepts and vibes.
'save_image' is a utility function that saves an image to a file path.
'load_image' is a utility function that loads an image from the given file path string.
WARNING:traitlets:Could not destroy zmq context for <jupyter_client.asynchronous.client.AsyncKernelClient object at 0x321d1f4c0>
Initial code and tests:
============================== Code ==============================
1 from typing import *
2 from pillow_heif import register_heif_opener
3 register_heif_opener()
4 import vision_agent as va
5 from vision_agent.tools import register_tool
6
7
8 from vision_agent.tools import Dalle3_text2img, save_image, Dalle3_prompt_gen, vit_image_classification
9 import os
10
11 # Plan 1
12 output_dict = {}
13 for i in range(20):
14 image_dict = Dalle3_text2img('Belle')
15 image_url = image_dict['img_url']
16 image = load_image(image_url)
17 save_image(image, os.path.join('/Users/feifan/benchmark/first_test', f'image_{i}.png'))
18 output_dict['Plan1'] = 'Images saved successfully'
19
20 # Plan 2
21 prompts = Dalle3_prompt_gen('Belle')
22 for i, prompt in enumerate(prompts):
23 image_dict = Dalle3_text2img(prompt['prompt'])
24 image_url = image_dict['img_url']
25 image = load_image(image_url)
26 save_image(image, os.path.join('/Users/feifan/benchmark/first_test', f'image_{i}.png'))
27 output_dict['Plan2'] = 'Images saved successfully'
28
29 # Plan 3
30 saved_images = 0
31 while saved_images < 20:
32 image_dict = Dalle3_text2img('Belle')
33 image_url = image_dict['img_url']
34 image = load_image(image_url)
35 classification = vit_image_classification(image)
36 if classification['labels'][0] == 'Belle':
37 save_image(image, os.path.join('/Users/feifan/benchmark/first_test', f'image_{saved_images}.png'))
38 saved_images += 1
39 output_dict['Plan3'] = 'Images saved successfully'
40
41 print(output_dict)
42
INFO:vision_agent.agent.vision_agent:Initial code execution result:
----- stdout -----
----- stderr -----
----- Error -----
Traceback (most recent call last):
File "/opt/miniconda3/envs/ben/lib/python3.10/site-packages/vision_agent/utils/execute.py", line 573, in exec_cell
self.nb_client.execute_cell(cell, len(self.nb.cells) - 1)
File "/opt/miniconda3/envs/ben/lib/python3.10/site-packages/jupyter_core/utils/__init__.py", line 165, in wrapped
return loop.run_until_complete(inner)
File "/opt/miniconda3/envs/ben/lib/python3.10/asyncio/base_events.py", line 641, in run_until_complete
return future.result()
File "/opt/miniconda3/envs/ben/lib/python3.10/site-packages/nbclient/client.py", line 1062, in async_execute_cell
await self._check_raise_for_error(cell, cell_index, exec_reply)
File "/opt/miniconda3/envs/ben/lib/python3.10/site-packages/nbclient/client.py", line 918, in _check_raise_for_error
raise CellExecutionError.from_cell_and_msg(cell, exec_reply_content)
nbclient.exceptions.CellExecutionError: An error occurred while executing the following cell:
from typing import *
from pillow_heif import register_heif_opener
register_heif_opener()
import vision_agent as va
from vision_agent.tools import register_tool
from vision_agent.tools import Dalle3_text2img, save_image, Dalle3_prompt_gen, vit_image_classification
import os
# Plan 1
output_dict = {}
for i in range(20):
image_dict = Dalle3_text2img('Belle')
image_url = image_dict['img_url']
image = load_image(image_url)
save_image(image, os.path.join('/Users/feifan/benchmark/first_test', f'image_{i}.png'))
output_dict['Plan1'] = 'Images saved successfully'
# Plan 2
prompts = Dalle3_prompt_gen('Belle')
for i, prompt in enumerate(prompts):
image_dict = Dalle3_text2img(prompt['prompt'])
image_url = image_dict['img_url']
image = load_image(image_url)
save_image(image, os.path.join('/Users/feifan/benchmark/first_test', f'image_{i}.png'))
output_dict['Plan2'] = 'Images saved successfully'
# Plan 3
saved_images = 0
while saved_images < 20:
image_dict = Dalle3_text2img('Belle')
image_url = image_dict['img_url']
image = load_image(image_url)
classification = vit_image_classification(image)
if classification['labels'][0] == 'Belle':
save_image(image, os.path.join('/Users/feifan/benchmark/first_test', f'image_{saved_images}.png'))
saved_images += 1
output_dict['Plan3'] = 'Images saved successfully'
print(output_dict)
----- stdout -----
Generated image URL: https://sampool-bucket.cn-shanghai.oss.aliyun-inc.com/aispace/llm/dall-e-output/0ba4998be0b6987bb3956b62d4f16b83.png?Expires=2036147759&OSSAccessKeyId=LTAI5tKQQgDsNWULhifdJCPC&Signature=JRzRWpKLrPpFusEUUzTROLmZ08E%3D
NameError Traceback (most recent call last)
Cell In[1], line 16
14 image_dict = Dalle3_text2img('Belle')
15 image_url = image_dict['img_url']
---> 16 image = load_image(image_url)
17 save_image(image, os.path.join('/Users/feifan/benchmark/first_test', f'image_{i}.png'))
18 output_dict['Plan1'] = 'Images saved successfully'
NameError: name 'load_image' is not defined
WARNING:traitlets:Could not destroy zmq context for <jupyter_client.asynchronous.client.AsyncKernelClient object at 0x321f94940>
WARNING:traitlets:Could not destroy zmq context for <jupyter_client.asynchronous.client.AsyncKernelClient object at 0x321d1df60>`