When fine-tuning LLM2CLIP, did you try unfreezing the LLM and the ViT gradients simultaneously? Do you think that might produce better results?

We have tried using LoRA to fine-tune the LLM, and it does yield a slight performance improvement, though not a substantial one. Perhaps more data is needed to fully unlock its potential; it is also possible that the distribution of image captions disrupts the LLM's original capabilities, so some new design effort seems necessary. We will disclose more related experiments in the next version of this work. If you have tried similar approaches, we would be happy to exchange ideas and share experiences.
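For reference, the low-rank update that LoRA applies to a frozen weight matrix can be sketched as below. This is a minimal, dependency-free illustration of the general technique, not the LLM2CLIP training code; all names and shapes here are hypothetical.

```python
# LoRA keeps the pretrained weight W frozen and learns a low-rank
# correction B @ A, so the effective weight is
#   W_eff = W + (alpha / r) * (B @ A)
# with r << min(d_out, d_in). Matrices are plain lists of lists.

def matmul(X, Y):
    """Multiply two matrices given as lists of lists."""
    inner, cols = len(Y), len(Y[0])
    return [[sum(row[k] * Y[k][j] for k in range(inner))
             for j in range(cols)] for row in X]

def lora_effective_weight(W, A, B, alpha, r):
    """Return W + (alpha / r) * (B @ A), the LoRA-adapted weight."""
    scale = alpha / r
    delta = matmul(B, A)  # rank-r update with the same shape as W
    return [[W[i][j] + scale * delta[i][j]
             for j in range(len(W[0]))] for i in range(len(W))]

# Toy example: a 2x2 frozen weight with a rank-1 adapter.
W = [[1.0, 0.0], [0.0, 1.0]]
B = [[1.0], [2.0]]   # d_out x r
A = [[0.5, 0.5]]     # r x d_in
print(lora_effective_weight(W, A, B, alpha=2.0, r=1))
# -> [[2.0, 1.0], [2.0, 3.0]]
```

Because only A and B receive gradients, this keeps the number of trainable parameters small, which is why unfreezing the full LLM alongside the ViT is a much heavier (and riskier) change than adding LoRA adapters.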