[EVAL] Add TUMLU benchmark #577
Comments
cc @hynky1999 — I feel this could interest you!
Is the dataset already on Hugging Face?
@clefourrier Not really (it's in gated repos), but everything is already on GitHub.
Gated sounds fine; can you share the path?
Hi, I think it would be a very nice addition; we already have TurkishMMLU (which I think is also part of your dataset, right?).
To add it, we would need the following:
Do you think you could do that? cc @gaydmi
@gaydmi Thank you for bringing this up! @hynky1999 I have a question. Our dataset can be split into subsets in three ways: (a) make each language a subset, (b) make each subject a subset, (c) make each language-subject combination a subset. Which one would you suggest? I could not find any similar examples in the repo.
@hynky1999 Hi, yes, working on it!
I would say ideally use subsets for the languages, and then add a column to identify the actual task subset. You can then use the hf_filter arg on the task.
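A rough sketch of that layout, assuming each language is a dataset config whose rows carry a `subject` column (the column name, row contents, and the shape of the filter predicate here are illustrative assumptions, not confirmed lighteval internals):

```python
# Hypothetical rows from one per-language subset (e.g. "uzbek"),
# where a "subject" column identifies the actual task subset.
rows = [
    {"question": "Which organelle produces ATP?", "subject": "biology"},
    {"question": "When did the empire fall?", "subject": "history"},
]

def subject_filter(subject):
    # Build a predicate over rows; something of this shape could be
    # passed as the per-task filter (cf. the hf_filter arg mentioned above).
    return lambda row: row["subject"] == subject

# Keep only the rows for one subject within the language subset.
biology_rows = [r for r in rows if subject_filter("biology")(r)]
```

This keeps the number of configs down to one per language while still letting each language-subject pair be evaluated as its own task.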
Both options sound good to me. I have added the dataset to Hugging Face: https://huggingface.co/datasets/jafarisbarov/TUMLU-mini @gaydmi let me know if I can help in any other way.
Awesome, cc @gaydmi — happy to review the PR once ready.
Hello!
We just released the benchmark for Turkic languages. Does it make sense if I add it to lighteval?
Evaluation short description
Why is this evaluation interesting?
First native-language MMLU benchmark for low-resource Turkic languages.
How is it used in the community?
Just released; multiple-choice high-school exam questions.
Evaluation metadata
Provide all available