Can I select causal attention for retrieval embeddings when using GritLM #45
The models are all uploaded here: https://huggingface.co/collections/GritLM/gritlm-65cc1403df78d51bb89f1651
Thanks for your help. I found the models for CCCC WM and CCCC LT. I wonder whether a CCCC & Mean model is available, and whether there is severe performance degradation when using WM. I'm also curious whether the CCCC & WM model can reasonably be used in place of a CCCC & Mean one.
There's no trained CCCC + M model because WM is better. Why do you not want to use WM?
I am just wondering why CCCC & M models have lower performance than CCCC & WM models. Also, I have one question.
The intuition for why WM is better than M for decoders, along with more comparisons, is in this paper: https://arxiv.org/pdf/2202.08904 ; it's not super well-written though, sorry!
Yes if you use
I'm surprised GritLM-7B gets that much worse if you do
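For reference, choosing the attention setting and pooling method at load time looks roughly like this with the gritlm package. This is only a sketch: the attn / pooling_method / mode argument names follow my reading of the repo README, and the instruction format below is the one from the main GritLM README, which may need adjusting for the ablation checkpoints.

```python
# Sketch only: argument names (attn, pooling_method, mode) and the instruction
# format follow my reading of the GritLM README and may differ in your version.
from gritlm import GritLM
from scipy.spatial.distance import cosine

def gritlm_instruction(instruction):
    # Embedding prompt format from the GritLM README.
    return "<|user|>\n" + instruction + "\n<|embed|>\n" if instruction else "<|embed|>\n"

# Causal attention everywhere ("cccc") with weighted-mean pooling, i.e. the
# setup the released CCCC checkpoints were trained with; swapping in
# pooling_method="mean" reproduces the M-vs-WM comparison discussed above.
model = GritLM("GritLM/gritlm_m7_sq2048_medi2",
               attn="cccc", pooling_method="weightedmean",
               mode="embedding", torch_dtype="auto")

queries = ["Why is weighted-mean pooling used for decoder embeddings?"]
docs = ["Weighted-mean pooling gives later tokens more weight, which suits causal attention."]

q_rep = model.encode(queries, instruction=gritlm_instruction("Retrieve the relevant passage."))
d_rep = model.encode(docs, instruction=gritlm_instruction(""))
print("cosine similarity:", 1 - cosine(q_rep[0], d_rep[0]))
```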
Thank you for sharing your paper :) We are experimenting with GritLM in our application; the top-2 hit rate is around 98% with bidirectional attention, but below 50% with causal attention.
Oh, that's poor. https://hf.co/GritLM/gritlm_m7_sq2048_medi2 should be even better than https://hf.co/GritLM/gritlm_m7_sq2048_medi (Table 15). Unfortunately, I didn't train a CCCC E5 model :/ I think the caching issue can be solved fairly easily by further finetuning GritLM to get it used to that format, but I haven't had the time to try.
Thanks for your suggestion.
Oh https://huggingface.co/GritLM/gritlm_m7_sq2048_medi2 should be both embedding & generation. |
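Roughly, using that one checkpoint for both tasks would look like this (a sketch following the README's unified-mode usage; the generate call is assumed to be forwarded to the underlying Hugging Face model, and the "<|user|>/<|assistant|>" prompt is the standard GritLM generation format):

```python
from gritlm import GritLM

# Sketch: one instance handles both embedding and generation in unified mode;
# API usage follows the GritLM README as I understand it and may need adjusting.
model = GritLM("GritLM/gritlm_m7_sq2048_medi2", attn="cccc", mode="unified", torch_dtype="auto")

# Embedding pass over documents to index.
doc_reps = model.encode(["Some passage to index."], instruction="<|embed|>\n")

# Generation pass with the same weights, using GritLM's generation format.
prompt = "<|user|>\nAnswer in one sentence: what is weighted-mean pooling?\n<|assistant|>\n"
encoded = model.tokenizer(prompt, return_tensors="pt").input_ids
gen = model.generate(encoded, max_new_tokens=64, do_sample=False)
print(model.tokenizer.decode(gen[0]))
```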
In our case, we want to cache queries.
I see, let me know how it goes!
We are experiencing the following issue and hope you can provide some comments or a solution. We are using the aforementioned causal/causal model with the process described above, and the results differ depending on whether the query cache is stored. When we input the query and documents without storing the cache, the response is accurate. However, when we input the cached query and the documents, the response becomes completely strange (repetitive phrases or nonsensical output). It also seems that the causal/causal model uses a system prompt at the beginning, unlike the released GritLM, and even after removing it for evaluation we still get similarly strange, incorrect results.
What do you mean by this? There should be no system prompt besides the formatting i.e.
It seems these two outcomes should be exactly the same, but since different results are coming out, I am asking this question. When using both the query and text in a single pass, the following format is used:
FULL_FORMAT = "<|embed|>\n{query}\n<|user|>\n {text}\n\nOptionally using the prior context answer the query prior to it\n<|assistant|>\n"
When forwarding the query alone, "<|embed|>\n" + query is used, and when forwarding the text, the format "\n<|user|>\n {text}\n\nOptionally using the prior context answer the query prior to it\n<|assistant|>\n" is used. Both should be exactly the same, but the result of 1) (the single full-prompt pass without the cache) comes out much better.
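Roughly, the two paths being compared look like the sketch below, written against plain transformers rather than the GritLM wrapper. The checkpoint name and generation settings are illustrative, and how generate() consumes a pre-computed past_key_values differs across transformers versions, so treat this as a sketch of the setup rather than a drop-in implementation.

```python
# Minimal sketch of the two-pass caching setup described above (plain
# transformers; dtype/device handling and generation settings are illustrative).
import torch
from transformers import AutoModelForCausalLM, AutoTokenizer

name = "GritLM/gritlm_m7_sq2048_medi2"
tok = AutoTokenizer.from_pretrained(name)
model = AutoModelForCausalLM.from_pretrained(name, torch_dtype=torch.bfloat16).eval()

query = "What does weighted-mean pooling do?"
text = "Weighted-mean pooling gives later tokens more weight when averaging hidden states."

prefix = "<|embed|>\n" + query
suffix = ("\n<|user|>\n " + text +
          "\n\nOptionally using the prior context answer the query prior to it\n<|assistant|>\n")

with torch.no_grad():
    # Pass 1: run the query prefix once and keep its KV cache
    # (this is what would be stored at embedding time).
    prefix_ids = tok(prefix, return_tensors="pt").input_ids
    cache = model(prefix_ids, use_cache=True).past_key_values

    # Pass 2: generate with the cache; input_ids still contains the full
    # prompt so positions line up, but the cached prefix is not recomputed.
    # NB: tokenizing prefix and suffix separately can yield different ids than
    # tokenizing the concatenated string, which is one way the two paths diverge.
    suffix_ids = tok(suffix, return_tensors="pt", add_special_tokens=False).input_ids
    cached_out = model.generate(input_ids=torch.cat([prefix_ids, suffix_ids], dim=1),
                                past_key_values=cache, max_new_tokens=64, do_sample=False)

    # Reference: a single pass over the full prompt with no reused cache.
    full_ids = tok(prefix + suffix, return_tensors="pt").input_ids
    full_out = model.generate(input_ids=full_ids, max_new_tokens=64, do_sample=False)

print(tok.decode(cached_out[0]))
print("---")
print(tok.decode(full_out[0]))
```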
Yes, that looks good to me and it should be the same. A few checks:
Thank you for answering.
In the paper, the ablation study on attention for embedding and generation is interesting.
Are these all different models, one trained per attention configuration?
Can I select causal attention for both cases when using GritLM-7B?
If not, could you share the model that uses causal attention for both?
I selected 'cccc' and 'mean' for retrieval embeddings, but the performance was significantly degraded :(
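For context, this is roughly how the selection looked at load time (a sketch; the attn / pooling_method argument names follow my reading of the gritlm wrapper and may differ):

```python
# Sketch of the load-time selection (argument names per my reading of the
# gritlm wrapper; they may differ in your version).
from gritlm import GritLM

# What was tried: the released GritLM-7B forced to causal embedding attention
# with plain mean pooling -- not the setting it was trained with.
tried = GritLM("GritLM/GritLM-7B", attn="cccc", pooling_method="mean", torch_dtype="auto")

# Released default for comparison: bidirectional embedding attention, causal generation ("bbcc").
default = GritLM("GritLM/GritLM-7B", attn="bbcc", pooling_method="mean", torch_dtype="auto")
```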