Qwen2-VL SFT training question #2853
Comments
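There are `freeze_vit` and `freeze_aligner` parameters for this.
What I mean is: with everything unfrozen and all parameters set to be trained, and training on mixed image-text data together with text-only data, do gradients propagate back to the ViT on the text-only portion? Does the ViT get updated on that data?
No, it is not updated.
With text-only data, the ViT and projector are never activated, so their parameters are not in the computation graph. In that case gradients would not propagate back to the ViT, right?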
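To make the computation-graph argument concrete, here is a minimal PyTorch sketch. It is not ms-swift or Qwen2-VL code: `ToyMLLM` and its `vit`, `projector`, `embed`, and `llm` submodules are hypothetical stand-ins, and the loss is an arbitrary scalar chosen only to produce gradients. It assumes gradients are cleared with `set_to_none=True` before each step and that nothing is carried over from an earlier image batch by gradient accumulation.

```python
import torch
import torch.nn as nn


class ToyMLLM(nn.Module):
    """Hypothetical toy model: vision branch runs only when images are given."""

    def __init__(self, dim=8, vocab=16):
        super().__init__()
        self.vit = nn.Linear(dim, dim)        # stand-in for the vision tower
        self.projector = nn.Linear(dim, dim)  # stand-in for the aligner
        self.embed = nn.Embedding(vocab, dim)
        self.llm = nn.Linear(dim, vocab)      # stand-in for the language model

    def forward(self, input_ids, pixel_values=None):
        h = self.embed(input_ids)  # (batch, tokens, dim)
        if pixel_values is not None:
            # Only here do vit/projector enter the computation graph.
            img = self.projector(self.vit(pixel_values))
            h = torch.cat([img, h], dim=1)
        return self.llm(h)


torch.manual_seed(0)
model = ToyMLLM()
opt = torch.optim.AdamW(model.parameters(), lr=1e-2)

vit_before = model.vit.weight.detach().clone()
llm_before = model.llm.weight.detach().clone()
input_ids = torch.randint(0, 16, (2, 5))

# Text-only step: the vision branch is skipped entirely.
opt.zero_grad(set_to_none=True)
model(input_ids).logsumexp(-1).mean().backward()
opt.step()

text_only_vit_grad = model.vit.weight.grad            # stays None
vit_unchanged = torch.equal(model.vit.weight, vit_before)
llm_changed = not torch.equal(model.llm.weight, llm_before)

# Image + text step: now the vision branch receives a gradient.
pixel_values = torch.randn(2, 3, 8)
opt.zero_grad(set_to_none=True)
model(input_ids, pixel_values).logsumexp(-1).mean().backward()
image_vit_grad_exists = model.vit.weight.grad is not None
```

In this sketch the optimizer skips parameters whose `.grad` is `None`, so the toy `vit` weights are left untouched on the text-only step (weight decay included) while the `llm` weights move. Whether a real training run behaves the same way depends on how the framework builds the forward pass and handles gradient accumulation, so treat this as an illustration of the mechanism rather than a statement about ms-swift internals.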
When training Qwen2-VL or another MLLM on both mixed image-text data and text-only data:
On a text-only batch, does ms-swift update the visual modules, that is, the weights of the ViT or the projector?