lineick/PromptRefining

PromptRefining

Project Foundation Models WT23 Uni Stuttgart

  1. By creating a feedback loop between ChatGPT-4V and a Text-to-Image (TTI) model, we aim to improve how precisely the generated image matches the original prompt.
  2. By evaluating the alignment at each iteration, we can identify blind spots in current TTI models and (multimodal) LLMs like GPT-4V.
  3. By removing the comparison to the original prompt from the loop, we can visualize the patterns that TTI models converge to and dig deeper into the perceptual blind spots of GPT-4V.

Research Questions/Goals

  1. Feedback Loop Enhancement of TTI Models and ChatGPT-4V: "Does the integration of a feedback loop between ChatGPT-4V and a Text-to-Image (TTI) model enhance the precision of images generated based on an initial prompt?"
  2. Detecting Blind Spots in Image Interpretation: "What are the blind spots of ChatGPT-4V in image recognition?"
  3. Detecting Blind Spots in Image Generation and Prompt Interpretation: "What are the blind spots of current TTI models when interpreting prompts and generating images?"

Methods to answer the Research Questions

Run the feedback loop with simple initial prompts, then analyze the iterations for divergent and convergent features.
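The loop described above can be sketched roughly as follows. This is a minimal illustration, not the project's actual implementation: `generate_image` and `refine_prompt` are hypothetical callables standing in for the TTI model and the GPT-4V step, and the choice of five iterations is an arbitrary assumption.

```python
from typing import Callable, List, Tuple


def run_feedback_loop(
    initial_prompt: str,
    generate_image: Callable[[str], bytes],      # hypothetical TTI wrapper
    refine_prompt: Callable[[str, bytes], str],  # hypothetical vision-model wrapper
    iterations: int = 5,                         # assumed default, not from the project
) -> List[Tuple[str, bytes]]:
    """Return the (prompt, image) pair produced at each iteration."""
    history: List[Tuple[str, bytes]] = []
    prompt = initial_prompt
    for _ in range(iterations):
        image = generate_image(prompt)
        history.append((prompt, image))
        # The vision model always sees the ORIGINAL prompt, so every
        # revision is judged against what was initially asked for.
        prompt = refine_prompt(initial_prompt, image)
    return history
```

Passing the models in as callables keeps the sketch independent of any particular API; the returned history is what the later analysis steps would inspect.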

Check the differences between iterations in the embedding space.
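One way to quantify these differences, shown here only as a sketch, is the cosine distance between the embeddings of consecutive iterations. The source does not say which encoder or distance is used, so both are assumptions; the embeddings are taken as plain lists of floats from whatever encoder is available, and `step_distances` is a hypothetical helper name.

```python
import math
from typing import List, Sequence


def cosine_distance(a: Sequence[float], b: Sequence[float]) -> float:
    """1 - cosine similarity; 0.0 means the vectors point the same way."""
    if len(a) != len(b):
        raise ValueError("embeddings must have the same dimension")
    dot = sum(x * y for x, y in zip(a, b))
    norm_a = math.sqrt(sum(x * x for x in a))
    norm_b = math.sqrt(sum(y * y for y in b))
    if norm_a == 0.0 or norm_b == 0.0:
        raise ValueError("cosine distance is undefined for a zero vector")
    return 1.0 - dot / (norm_a * norm_b)


def step_distances(embeddings: Sequence[Sequence[float]]) -> List[float]:
    """Distance between each iteration's embedding and the previous one."""
    return [
        cosine_distance(prev, curr)
        for prev, curr in zip(embeddings, embeddings[1:])
    ]
```

A sequence of distances that shrinks toward zero would suggest the loop is settling; large jumps mark iterations worth inspecting manually.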

Run the loop, but instead of aligning to an initial prompt, focus on keeping the first image stable (describe -> prompt -> image -> describe -> prompt -> image ...).

Analyze convergent patterns and regularities.
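The detached loop and a simple convergence check might look like the sketch below. It rests on several assumptions: `describe` and `generate_image` are hypothetical wrappers around the vision model and the TTI model, `distance` is any user-supplied dissimilarity between two descriptions, and "converged" is taken to mean that two consecutive descriptions fall within a chosen threshold. The project may define convergence differently.

```python
from typing import Callable, List, Optional, Sequence


def run_detached_loop(
    first_image: bytes,
    describe: Callable[[bytes], str],        # hypothetical vision-model wrapper
    generate_image: Callable[[str], bytes],  # hypothetical TTI wrapper
    iterations: int = 5,                     # assumed default
) -> List[str]:
    """Alternate describe -> image without ever consulting an original prompt."""
    descriptions: List[str] = []
    image = first_image
    for _ in range(iterations):
        description = describe(image)
        descriptions.append(description)
        # The description itself becomes the next prompt.
        image = generate_image(description)
    return descriptions


def first_stable_step(
    descriptions: Sequence[str],
    distance: Callable[[str, str], float],
    threshold: float,
) -> Optional[int]:
    """Index of the first description within `threshold` of its predecessor."""
    for i in range(1, len(descriptions)):
        if distance(descriptions[i - 1], descriptions[i]) <= threshold:
            return i
    return None  # no consecutive pair was close enough
```

In practice the distance would more plausibly be computed on embeddings than on raw strings, since model output rarely repeats verbatim.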

How to evaluate image precision?

Intuition: If a prompt says "a cheese and a mouse", the image should not contain additional details that were not requested (e.g. a mouse wearing clothes, a cheese plate).

Precision is measured by subjective manual analysis combined with embedding analysis.
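To make the intuition concrete, the manual part of the analysis could be recorded as two item lists per image: what the prompt asked for and what an annotator sees. The set-based score below is an illustrative assumption, not the metric used in this project; `image_precision` and `extra_items` are hypothetical helper names.

```python
from typing import Iterable, Set


def _normalize(items: Iterable[str]) -> Set[str]:
    """Lower-case and strip labels so 'Mouse' and 'mouse ' count as one item."""
    return {item.strip().lower() for item in items}


def extra_items(prompt_items: Iterable[str], image_items: Iterable[str]) -> Set[str]:
    """Items an annotator found in the image that the prompt never asked for."""
    return _normalize(image_items) - _normalize(prompt_items)


def image_precision(prompt_items: Iterable[str], image_items: Iterable[str]) -> float:
    """Fraction of depicted items that were actually requested by the prompt."""
    requested = _normalize(prompt_items)
    depicted = _normalize(image_items)
    if not depicted:
        return 0.0  # assumed convention: an image with no annotated items scores 0
    return len(requested & depicted) / len(depicted)
```

For the prompt "a cheese and a mouse" and an image annotated as cheese, mouse, clothes, and plate, `image_precision` gives 0.5 and `extra_items` returns the two unrequested items.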

Related Work
