Model Performance Check #779

Luodian · 2025-07-31T04:54:27Z

Luodian
Jul 31, 2025
Maintainer

Hi, we are opening this discussion so that lmms-eval users can talk about the reproducibility of different models. If a model's performance does not match the expected scores, post the details here and we will do our best to help.

tjsgh531 · 2025-10-14T23:49:10Z

tjsgh531
Oct 14, 2025

I ran the MMMU evaluation on the Qwen2.5-VL-7B model, and the score differs a lot from the published one.

The score reported by open-compass/VLMEvalKit is 58.0, while the score I get with lmms-eval is 51.56.

Can you confirm this?
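
For reference, here is a quick sketch of how large that gap is, using only the two scores quoted above. The helper name `score_gap` is made up for illustration and is not part of lmms-eval or VLMEvalKit.

```python
def score_gap(reference: float, measured: float) -> tuple[float, float]:
    """Return (absolute gap in points, gap relative to the reference score)."""
    absolute = reference - measured
    relative = absolute / reference
    return absolute, relative


# Scores quoted in this post: VLMEvalKit (reference) vs. lmms-eval (measured).
absolute, relative = score_gap(reference=58.0, measured=51.56)
print(f"absolute gap: {absolute:.2f} points")
print(f"relative gap: {relative:.1%}")
```

So the lmms-eval result is about 6.44 points lower, roughly 11% below the VLMEvalKit number.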

