Description
Feature request
Hello! It would be awesome to have LLaVA support: upload an image to the API and have it embedded by the vision encoder (CLIP etc.).
https://github.com/haotian-liu/LLaVA
text-generation-webui already has this built in, but we need the true model parallelization of this repository for faster inference.
Thank you very much! :)
Motivation
The current LLaVA inference code does not allow for true model parallelization. We (like many others) need very fast inference with LLaVA models.
Your contribution
I am not experienced enough to do any of the foundational work, but once the foundation is there I would love to help port code from the LLaVA repository to this one.