
Cannot reproduce the JMMMU evaluation result of SakanaAI/Llama-3-EvoVLM-JP-v2 #449

Open
CHENSSR opened this issue Dec 9, 2024 · 1 comment

Comments


CHENSSR commented Dec 9, 2024

May I ask what generation parameters you used for the JMMMU evaluation of SakanaAI/Llama-3-EvoVLM-JP-v2, other than "A maximum output length is set to 1,024 and a temperature is set to 0 for all models during inference."? The best accuracy I have been able to get so far is only 28.48%. Thank you!

| Groups | Version | Filter | n-shot | Metric | Value | Stderr |
|---|---|---|---|---|---|---|
| jmmmu_all | N/A | none | | jmmmu_acc | 0.2848 | ± N/A |
| - culture_agnostic | N/A | none | | jmmmu_acc | 0.2889 | ± N/A |
| - culture_specific | N/A | none | | jmmmu_acc | 0.2800 | ± N/A |
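
For reference, this is roughly how I am invoking the evaluation (a minimal sketch; the model class name and flag spellings here are my assumptions, and only the maximum output length of 1,024 and temperature of 0 come from the paper):

```bash
# Sketch of an lmms-eval run for JMMMU; adjust --model to whichever
# class actually wraps SakanaAI/Llama-3-EvoVLM-JP-v2 in your setup.
# max_new_tokens=1024 and temperature=0 follow the paper's stated settings;
# everything else is an assumption.
python -m lmms_eval \
    --model llava \
    --model_args pretrained=SakanaAI/Llama-3-EvoVLM-JP-v2 \
    --tasks jmmmu \
    --batch_size 1 \
    --gen_kwargs max_new_tokens=1024,temperature=0 \
    --log_samples \
    --output_path ./logs/
```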

kcz358 (Collaborator) commented Dec 25, 2024

I think for this you might need to contact the authors of JMMMU or EvoVLM-JP-v2 and see if they have any advice on this.
