-
Notifications
You must be signed in to change notification settings - Fork 2.5k
New issue
Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.
By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.
Already on GitHub? Sign in to your account
Implement AnglE loss #2437
Comments
I will get this out for review in the coming days! AnglE loss is CoSENT loss with a novel pairwise similarity metric. In the proposed code for CoSENT loss I allow for the choice of similarity metric in the constructor. In my head, it might make sense to implement the AnglE metric calculation in utils, where a user can specify it as a metric instead of the default pairwise_cos_sim when using CoSENT loss. What do others think of that? |
If that similarity metric has a similar signature as e.g. pairwise cosine similarity, then it may make sense to discuss this with @ir2718 who is planning on creating a new Enum with most/all similarity measures in #2441. (Edit: I see now that the signature is exactly like the pairwise cosine similarity! As far as I'm concerned, feel free to add it to I would be in favor of this approach. We can then also introduce an AnglELoss class which is just a subclass of CoSENT, but with the similarity measure fixed to the pairwise similarity metric. How does that sound?
|
Sounds great Tom! I'll get started on that soon |
The angle difference in util.pairwise_angle_sim is implemented as follows,
But in the paper, it actually is represented as I'm attaching a screenshot from the paper as well. I might be understanding it incorrectly, and would appreciate some clarification. Thanks in advance :) @tomaarsen @johneckberg |
Hello @dawnik17. Apologies for the delay.
should be scores = scores[None, :] - scores[:, None] to match the paper? If so, the current implementation does give notably better performance when training a simple model.
|
Hello!
Context
This issue describes a feature that I am planning to be included in a release before v3, or alternatively, in v3 of Sentence Transformers.
Details
Some recent work by Li & Li, 2023 introduces an AnglE objective. Its inclusion in Sentence Transformers would be beneficial to allow our users to also train using this new loss function. The first author, @SeanLee97, has implemented the objective in https://github.com/SeanLee97/AnglE.
@johneckberg has expressed an interest in contributing this loss.
cc @bwanglzu @ir2718 @johneckberg @aamir-s18 as I know you're interested in my TODO list. John will lead this development, and I will review his PR. Additionally, @SeanLee97 may also have a look if he finds the time.
The text was updated successfully, but these errors were encountered: