Evaluate GritLM-7B on MTEB datasets #57
This is on purpose and happens here: gritlm/evaluation/eval_mteb.py, Line 1177 (commit 7c06435).
It only evaluates the 56 main MTEB English datasets and skips the others; the warnings are fine.
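For illustration, the skip behavior amounts to filtering the requested tasks against a whitelist of the main English datasets. A minimal sketch, assuming nothing about the script's actual internals (the task names and helper below are hypothetical stand-ins, not the script's real variables):

```python
# Hypothetical illustration of whitelist-based task filtering, as the
# eval script conceptually does for the 56 main MTEB English tasks.
# This abbreviated set is a stand-in, not the real whitelist.
MAIN_EN_TASKS = {"Banking77Classification", "STS17", "SciFact"}

def select_tasks(requested):
    """Keep only whitelisted tasks; report the rest as skipped."""
    kept = [t for t in requested if t in MAIN_EN_TASKS]
    skipped = [t for t in requested if t not in MAIN_EN_TASKS]
    return kept, skipped

kept, skipped = select_tasks(["STS17", "STS16", "SciFact"])
# kept → ["STS17", "SciFact"]; skipped → ["STS16"]
```

Tasks outside the whitelist are what trigger the "skipped" warnings, which is why they are harmless.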
Thank you very much! I noticed that only 8 tasks were evaluated: 6 classification tasks and 2 STS tasks. I'd like to evaluate GritLM-7B on all the tasks mentioned in the paper and compare the results. Could you guide me on how to proceed?
Oh sorry, it seems the latest version of MTEB had some changes which render the eval script in this repository outdated. I just changed the repo's requirements to install a different mteb version here: #58. Can you try downgrading your mteb to the version in that PR? (If you want to use the latest mteb, it should also work via something like the below.)

```python
# !pip install mteb gritlm
import mteb

model_name = "GritLM/GritLM-7B"
revision = "13f00a0e36500c80ce12870ea513846a066004af"

model = mteb.get_model(model_name, revision=revision)
benchmark = mteb.get_benchmark("MTEB(eng, classic)")
evaluation = mteb.MTEB(tasks=benchmark)
results = evaluation.run(model)
```
It begins to evaluate on the other datasets, thanks! Also, I'd like to know whether a single A100-80GB GPU is sufficient to evaluate on MTEB.
I think that is sufficient; it will just take a while (especially the retrieval datasets).
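As a rough sanity check (my own back-of-the-envelope arithmetic, not from the thread): for inference, the dominant memory cost is the model weights themselves, and 7B parameters in fp16/bf16 leave ample headroom on an 80 GB card.

```python
# Back-of-the-envelope: do 7B fp16/bf16 weights fit on an 80 GB GPU?
params = 7e9            # 7 billion parameters
bytes_per_param = 2     # fp16/bf16 storage per weight
weights_gb = params * bytes_per_param / 1e9   # ≈ 14 GB
gpu_gb = 80

headroom_gb = gpu_gb - weights_gb  # left over for activations / batches
print(f"weights ≈ {weights_gb:.0f} GB, headroom ≈ {headroom_gb:.0f} GB")
# prints: weights ≈ 14 GB, headroom ≈ 66 GB
```

So memory is not the bottleneck; wall-clock time on the large retrieval corpora is.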
Hi! The evaluation proceeds fine until the
Looks like a corrupted download; you can try deleting
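A hedged sketch of deleting one cached Hugging Face dataset so it re-downloads on the next run (the default cache location is the usual `~/.cache/huggingface/datasets` but may differ on your setup; `clear_dataset_cache` is a hypothetical helper I'm introducing here, not part of mteb):

```python
import shutil
from pathlib import Path

# Default Hugging Face datasets cache; adjust if your setup differs.
DEFAULT_CACHE = Path.home() / ".cache" / "huggingface" / "datasets"

def clear_dataset_cache(dataset_dir_name, cache_root=None):
    """Delete one cached dataset directory so it re-downloads next run.

    Returns True if a directory was removed, False if nothing was cached.
    """
    root = Path(cache_root) if cache_root is not None else DEFAULT_CACHE
    target = root / dataset_dir_name
    if target.is_dir():
        shutil.rmtree(target)
        return True
    return False
```

After deleting the offending dataset's directory, rerunning the evaluation forces a fresh download of just that dataset.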
I've tried cleaning the cache but the error persists. I found a closed issue in the MTEB repo that reports the same problem. Do I need to downgrade? Thank you so much!
Hm, yeah, maybe try downgrading.
I am trying to evaluate GritLM-7B on MTEB datasets using the provided script.
However, it seems that it has only been evaluated on the following datasets:
AmazonCounterFactualClassification
AmazonReviewsClassification
MassiveIntentClassification
MassiveScenarioClassification
MTOPDomainClassification
MTOPIntentClassification
STS17
STS22
Other datasets seem to be skipped. The output log is shown here:
And the error log contains some warnings such as:
I would really appreciate it if you could help me with this! Thank you so much!