Description
Describe the bug
Hello! With Sentence Transformers v5, the default prompt initialization creates empty prompts that cause MTEB to reject valid model prompts.
Root Cause:
In ST v5, models are initialized with default empty prompts (for better consistency with the updates included in the release), as you can see here:
self.prompts = {"query": "", "document": ""}
if prompts:
    self.prompts.update(prompts)
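To illustrate the effect, here is a minimal sketch of that merge using plain dictionaries. The function `init_prompts` is a hypothetical stand-in for the initialization quoted above, not Sentence Transformers code, and the key `"STS12"` with its prompt text is only an example of a task-specific entry.

```python
def init_prompts(prompts=None):
    """Mimic the default-then-update pattern quoted above (stand-in, not library code)."""
    merged = {"query": "", "document": ""}
    if prompts:
        merged.update(prompts)
    return merged


# A model without any configured prompts still ends up with two empty entries.
no_prompts = init_prompts()

# A model with one task-specific prompt ends up with that prompt plus the two empty defaults.
task_prompts = init_prompts({"STS12": "Represent the sentence: "})

print(no_prompts)
print(task_prompts)
```

In both cases the resulting dictionary contains the `"query"` and `"document"` keys, which is the mix that MTEB later has to validate.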
The Problem:
Even when a model has valid prompts (e.g. task-specific ones), the default initialization means that self.prompts now always contains the two keys "query" and "document", whether they are empty ("") or not. This means self.prompts might contain a mix of valid prompts, empty ones, and ones in a format MTEB does not accept.
When MTEB validates these prompts, it calls self.validate_task_to_prompt_name(self.model.prompts). This appears to trigger a KeyError exception, which is caught in this code block:
mteb/mteb/models/sentence_transformer_wrapper.py
Lines 46 to 51 in cfa27d7
try:
    model_prompts = self.validate_task_to_prompt_name(self.model.prompts)
except KeyError:
    model_prompts = None
    logger.warning(
        "Model prompts are not in the expected format. Ignoring them."
The KeyError exception causes MTEB to set model_prompts = None and log the warning message: "Model prompts are not in the expected format. Ignoring them."
So what we have right now is:
- No prompts in config → No warning message
- Wrong format prompts in config → Warning message "ignoring prompts"
And what I think would be nice to have is:
- If only the empty default prompts are present ({"query": "", "document": ""}) => Ignore them silently, without raising the warning message
- If there are prompts with valid formats => Use them for evaluation
- If there are generic prompts for "query" and "document" in the config but no task-specific prompts => Use the generic ones as defaults
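The behaviour proposed in the list above could be sketched roughly as follows. This is only an illustration under stated assumptions: `resolve_prompts` is a hypothetical helper, not an MTEB function, and `toy_validate` is a toy stand-in for MTEB's validation step that is merely assumed to raise KeyError on unrecognized prompt names.

```python
import logging

logger = logging.getLogger(__name__)


def resolve_prompts(prompts, validate):
    """Hypothetical helper: return the usable prompts, or None if there are none.

    `validate` stands in for MTEB's validation step (assumption: it raises
    KeyError when a prompt name is not recognized).
    """
    # Drop empty prompts, such as the ST v5 defaults {"query": "", "document": ""}.
    non_empty = {name: text for name, text in (prompts or {}).items() if text}
    if not non_empty:
        # Only empty defaults (or nothing at all): ignore silently, no warning.
        return None
    try:
        # Valid task-specific or generic "query"/"document" prompts are kept.
        return validate(non_empty)
    except KeyError:
        logger.warning("Model prompts are not in the expected format. Ignoring them.")
        return None


def toy_validate(prompts):
    """Toy validator for this example only; the accepted names are made up."""
    known = {"query", "document", "STS12"}
    for name in prompts:
        if name not in known:
            raise KeyError(name)
    return prompts


only_defaults = resolve_prompts({"query": "", "document": ""}, toy_validate)
with_task = resolve_prompts({"query": "", "document": "", "STS12": "Represent: "}, toy_validate)
generic_only = resolve_prompts({"query": "query: ", "document": ""}, toy_validate)
```

The design choice here is to filter out empty strings before validation, so the warning is only logged when a non-empty prompt actually has an unexpected name.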
This approach would allow MTEB to handle ST v5's initialization pattern while still using any valid prompts that are available. I'm not really aware of the bigger picture of the repo, so I might be missing some complications in my solution.
Thanks!
- Arthur BRESNU
To reproduce
Use the latest version of MTEB together with Sentence Transformers v5 and a model containing task-specific prompts.
Additional information
No response
Are you interested to contribute a fix for this bug?
No