Feature: support for multimodal models #259

Open

massi-ang opened this issue Apr 12, 2024 · 1 comment

Contributor

massi-ang commented Apr 12, 2024

Testing multimodal-capable models like Claude 3 and Idefics requires new input nodes that can feed images into the prompt nodes.

Owner

ianarawjo commented Apr 15, 2024

Yeah, this is on the roadmap for sure. I don't have much time lately, so if anyone is reading this and wants to take a shot at it, please do!

Two challenges from a design standpoint:

How to add images as options to existing nodes without cluttering the interface.
How to handle images inside Tabular Data (e.g., can we load images inside spreadsheets? Is there a standard image database format? Can we autodetect local URLs and fetch the image?)
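On the autodetection question, here is a minimal sketch of how a table cell could be classified as an image reference or plain text. It is only an illustration: `classify_cell` and its return labels are hypothetical names, not part of the project, and the check relies purely on the file extension (via Python's standard `mimetypes` module), so it never opens or fetches anything.

```python
import mimetypes
from urllib.parse import urlparse


def classify_cell(value: str) -> str:
    """Hypothetical helper: label a cell as 'data_uri', 'remote_image',
    'local_image', or 'text' using only its string form."""
    text = value.strip()
    # Inline images encoded as data URIs carry their MIME type up front.
    if text.startswith("data:image/"):
        return "data_uri"
    parsed = urlparse(text)
    # Guess the MIME type from the file extension of the path component.
    mime, _ = mimetypes.guess_type(parsed.path)
    if not (mime and mime.startswith("image/")):
        return "text"
    if parsed.scheme in ("http", "https"):
        return "remote_image"
    # No scheme (relative or absolute path) or an explicit file:// URL.
    if parsed.scheme in ("", "file"):
        return "local_image"
    return "text"


if __name__ == "__main__":
    for cell in ["https://example.com/cat.png", "photos/dog.jpg", "Describe this picture"]:
        print(cell, "->", classify_cell(cell))
```

This heuristic is deliberately crude: a sentence that happens to end in something like `image.png` would be misclassified, and Windows drive paths are not handled. A real implementation would likely need an explicit column type or a user override.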

One might argue we just create a new node, MultiModalFields or something, but this might just clutter things.
