ANN support? #7

Open
astoilkov opened this issue Jun 17, 2023 · 2 comments
Comments

@astoilkov

I see the implementation uses cosine similarity. Performance gains come from normalizing the embeddings and caching them.
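For readers unfamiliar with the approach described above, here is a minimal sketch of exact (brute-force) search over normalized, cached embeddings. It is not the project's actual code: `BruteForceIndex`, `normalize`, and `dot` are hypothetical names chosen for illustration. The point it shows is that once vectors are scaled to unit length, cosine similarity is just a dot product, so the normalization cost is paid once per stored vector.

```typescript
// Hypothetical helper names; not taken from the project under discussion.
type Vector = number[];

// Scale a vector to unit length so cosine similarity reduces to a dot product.
function normalize(v: Vector): Vector {
  let sumSquares = 0;
  for (const x of v) sumSquares += x * x;
  const norm = Math.sqrt(sumSquares);
  if (norm === 0) return v.slice(); // leave zero vectors unchanged
  return v.map((x) => x / norm);
}

function dot(a: Vector, b: Vector): number {
  let sum = 0;
  for (let i = 0; i < a.length; i++) sum += a[i] * b[i];
  return sum;
}

class BruteForceIndex {
  // Cache of embeddings that were normalized once, at insertion time.
  private entries: { id: string; vector: Vector }[] = [];

  add(id: string, embedding: Vector): void {
    this.entries.push({ id, vector: normalize(embedding) });
  }

  // Exact top-k: every cached vector is scored, so cost grows linearly
  // with the number of stored embeddings.
  search(query: Vector, k: number): { id: string; score: number }[] {
    const q = normalize(query);
    return this.entries
      .map((e) => ({ id: e.id, score: dot(q, e.vector) }))
      .sort((a, b) => b.score - a.score)
      .slice(0, k);
  }
}

const bruteIndex = new BruteForceIndex();
bruteIndex.add("a", [1, 0]);
bruteIndex.add("b", [0, 2]);
bruteIndex.add("c", [3, 3]);
const topTwo = bruteIndex.search([1, 0.1], 2);
console.log(topTwo.map((r) => r.id));
```

Because every query touches every stored vector, this scales linearly with dataset size, which is the cost an ANN index tries to avoid.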

Have you considered ANN? I guess something like https://github.com/DanielKRing1/Annoy.js?

@MentalGear

Interesting, how would you implement this?
Are there possible drawbacks, for example trading recall for speed?

@astoilkov
Author

> Interesting, how would you implement this?

I think the most popular way to implement it is with an approximate k-nearest-neighbor (ANN) search algorithm. However, I should note that I'm not knowledgeable in that area.

> Are there possible drawbacks, for example trading recall for speed?

Yes, the algorithm makes such a tradeoff: slightly lower accuracy (recall) in exchange for a large speedup when the dataset is large. This is what you can expect from commercial vector databases (example: https://supabase.com/vector).
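To make the tradeoff concrete, here is a rough sketch of one simple ANN technique, random-hyperplane locality-sensitive hashing. This is not how Annoy.js or any particular vector database works internally; the class `HyperplaneIndex`, the bit count, and the dataset size are all assumptions picked for illustration. Each vector is hashed to a bucket by which side of a few random hyperplanes it falls on, and a query only scores vectors in its own bucket, so true neighbors that land in a different bucket are missed.

```typescript
// Hypothetical sketch of random-hyperplane LSH; names and parameters are illustrative.
type Vec = number[];

// Small seeded generator (a linear congruential generator) for reproducibility.
function makeRandom(seed: number): () => number {
  let state = seed % 4294967296;
  return () => {
    state = (state * 1664525 + 1013904223) % 4294967296;
    return state / 4294967296;
  };
}

function dotProduct(a: Vec, b: Vec): number {
  let sum = 0;
  for (let i = 0; i < a.length; i++) sum += a[i] * b[i];
  return sum;
}

function toUnit(v: Vec): Vec {
  const norm = Math.sqrt(dotProduct(v, v));
  return norm === 0 ? v.slice() : v.map((x) => x / norm);
}

class HyperplaneIndex {
  private planes: Vec[] = [];
  private vectors: Vec[] = [];
  private buckets: { [key: string]: number[] } = {};

  constructor(dim: number, bits: number, random: () => number) {
    // Uniform components are a simplification; Gaussian ones are more common.
    for (let p = 0; p < bits; p++) {
      const plane: Vec = [];
      for (let d = 0; d < dim; d++) plane.push(random() * 2 - 1);
      this.planes.push(plane);
    }
  }

  // One bit per hyperplane: which side of the plane the vector falls on.
  private hash(v: Vec): string {
    return this.planes.map((p) => (dotProduct(p, v) >= 0 ? "1" : "0")).join("");
  }

  add(v: Vec): number {
    const unit = toUnit(v);
    const id = this.vectors.push(unit) - 1;
    const key = this.hash(unit);
    (this.buckets[key] = this.buckets[key] || []).push(id);
    return id;
  }

  // Approximate: only vectors sharing the query's bucket are scored.
  search(query: Vec, k: number): { id: number; score: number }[] {
    const q = toUnit(query);
    return this.rank(q, this.buckets[this.hash(q)] || [], k);
  }

  // Exact baseline: scores every stored vector.
  searchExact(query: Vec, k: number): { id: number; score: number }[] {
    return this.rank(toUnit(query), this.vectors.map((_, id) => id), k);
  }

  private rank(q: Vec, ids: number[], k: number) {
    return ids
      .map((id) => ({ id, score: dotProduct(q, this.vectors[id]) }))
      .sort((a, b) => b.score - a.score)
      .slice(0, k);
  }
}

const random = makeRandom(42);
const dataset: Vec[] = [];
for (let i = 0; i < 200; i++) {
  const v: Vec = [];
  for (let d = 0; d < 8; d++) v.push(random() * 2 - 1);
  dataset.push(v);
}
const annIndex = new HyperplaneIndex(8, 4, random);
for (const v of dataset) annIndex.add(v);

const approx = annIndex.search(dataset[0], 5);
const exact = annIndex.searchExact(dataset[0], 5);
const exactIds = exact.map((r) => r.id);
const hits = approx.filter((r) => exactIds.indexOf(r.id) !== -1).length;
console.log(`recall@5: ${hits}/${exact.length}`);
```

More hyperplane bits mean smaller buckets and faster queries but lower recall. Real ANN libraries typically soften this by building several trees or hash tables and merging their candidates.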
