-
I evaluated the Qwen2.5-VL-7B model on MMMU and found a large score gap: the score released by open-compass/VLMEvalKit is 58.0, while the score I get with lmms-eval is 51.56, a difference of 6.44 points. Can you confirm whether this is expected?
-
Hi, we opened this discussion so that lmms-eval users can discuss the reproducibility of different models. Feel free to post about any mismatch between a model's performance and its expected scores; let us know and we will try our best to assist.