[Question]: Similarity balance problem #6604

Open
4 tasks done
zone-lou opened this issue Mar 27, 2025 · 2 comments
Labels
🙋‍♀️ question Further information is requested

Comments

@zone-lou

Self Checks

  • I have searched for existing issues, including closed ones.
  • I confirm that I am using English to submit this report (Language Policy).
  • Non-English title submissions will be closed directly (Language Policy).
  • Please do not modify this template :) and fill in all the required fields.

Describe your problem

Keywords are generated and set on each chunk when the document is parsed. At query time, the chunks with high keyword similarity come back in the order shown below, not at the top. Is there a configuration option that gives keyword matches priority in the ranking?

The chunk I want ranked first:
Image

The chunk that actually ranks first:
Image

Changing the keyword similarity weight does not change the order of the results:
Image

@zone-lou zone-lou added the 🙋‍♀️ question Further information is requested label Mar 27, 2025
@KevinHuSh
Collaborator

If changing the keyword weight does not change the order, the embedding similarity of all the chunks must be the same.
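To illustrate the point, here is a minimal sketch that assumes the final score is a plain weighted sum of keyword similarity and embedding similarity. The `rank` function, the field names, and the sample values are all hypothetical and are not taken from the project's code; the real scoring may differ.

```python
# Hypothetical hybrid ranking: score = w * keyword_sim + (1 - w) * vector_sim.
# This is an assumed formula for illustration, not the project's implementation.

def rank(chunks, keyword_weight):
    """Return chunk ids sorted by the weighted score, highest first."""
    def score(chunk):
        return (keyword_weight * chunk["keyword_sim"]
                + (1 - keyword_weight) * chunk["vector_sim"])
    return [c["id"] for c in sorted(chunks, key=score, reverse=True)]

# Case 1: every chunk has the same embedding similarity.
same_vec = [
    {"id": "A", "keyword_sim": 0.9, "vector_sim": 0.5},
    {"id": "B", "keyword_sim": 0.2, "vector_sim": 0.5},
]

# Case 2: embedding similarity differs between chunks.
diff_vec = [
    {"id": "A", "keyword_sim": 0.9, "vector_sim": 0.3},
    {"id": "B", "keyword_sim": 0.2, "vector_sim": 0.8},
]

for weight in (0.1, 0.9):
    print(weight, rank(same_vec, weight), rank(diff_vec, weight))
```

Under this assumption, the order in the first case is the same at both weights, because the embedding term adds the same amount to every chunk. In the second case the order flips as the keyword weight goes from 0.1 to 0.9.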

@zone-lou
Author

If the keyword weight does not change the order, the embedding similarity of all the chunks must be the same.

"The embedding similarity of all the chunks must be the same." What should we do about this?
