
[Feature Request] Cache and manage the embeddings in a persistent storage #390

Closed
0x7c13 opened this issue Apr 1, 2024 · 2 comments
Labels
enhancement New feature or request

Comments

0x7c13 (Member) commented Apr 1, 2024

Context / Scenario

This post dives deeper into the topic raised by the related PR: #389

The problem

The problem is simple: we want to avoid calling the embedding API as much as possible, since it is often slow and expensive.
One quick and cheap solution is to cache the embeddings by content hash, betting that the same hash shows up again when feeding KM a large document, or multiple documents with repeated content (that is what the above PR is about).

BUT, I don't think this is an ideal solution for real-world scenarios. Why? Because:

  1. We rarely get repeated text or paragraphs in most cases.
  2. The above PR only helps within the scope of the current document(s) ingestion.

Let's skip the first one and go straight to the second scenario:

There are many cases where we want to update existing document(s) or re-ingest them as their content gets refreshed or updated, whether it is a text document or a web page. In both cases most of the content remains the same, yet the embedding happens again and again even if you re-import them using the same document id. This is the scenario where I believe a persistent embedding cache storage is needed to improve the speed and reduce the cost of continuously ingested documents.

Proposed solution

In addition to the FileStorageDb and MemoryDb for the vectors and text, we could have another abstraction + implementation, an EmbeddingsCacheDb, which can be configured and used by the GenerateEmbeddingsHandler to avoid re-generating embeddings for the same partitioned content over time and across workers. Ideally the content hash would be stored in a distributed cache storage like Redis, with the associated embeddings in a blob storage, so the cache works across multiple workers.
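
To make the shape concrete, here is a minimal sketch of what such an abstraction could look like. The interface name and members are hypothetical, not an existing KM API:

```csharp
using System.Threading;
using System.Threading.Tasks;

// Hypothetical abstraction, injected next to IMemoryDb / content storage.
// All names and shapes here are illustrative only.
public interface IEmbeddingsCacheDb
{
    // Returns the cached embedding for the given key, or null on a miss.
    Task<float[]?> TryGetAsync(string key, CancellationToken ct = default);

    // Stores the embedding under the given key.
    Task SetAsync(string key, float[] embedding, CancellationToken ct = default);

    // Evicts all cached entries associated with a document, for invalidation.
    Task RemoveByDocumentAsync(string index, string documentId, CancellationToken ct = default);
}
```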

We might just need to re-design or update how we store the embeddings, so that it is easy to check whether an embedding already exists for a given content hash and nothing needs to be stored twice. Ideally just an additional mapping between hash and embedding is needed, or maybe the hash could be included in the entity name itself.
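
For the lookup key itself, a stable content hash (e.g. SHA-256 over the partition text) would be enough; a minimal sketch, with all names illustrative:

```csharp
using System;
using System.Security.Cryptography;
using System.Text;

public static class EmbeddingCacheKey
{
    // Hashes the partitioned text so identical content always maps to the
    // same cache entry, regardless of which document or ingestion run it
    // comes from.
    public static string FromContent(string partitionText)
    {
        byte[] hash = SHA256.HashData(Encoding.UTF8.GetBytes(partitionText));
        return Convert.ToHexString(hash);
    }
}
```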

User should be able to (a rough config sketch follows the list):

  • Customize the storage type and location of this cache.
  • Control the behavior of this cache through config (a maximum storage limit, etc.).
  • Invalidate the cache by policy (e.g. all cached embeddings associated with a given document should be removed when that document is deleted by document id or index).
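
As a rough illustration of the knobs above, the cache could expose a config class along these lines; none of these settings exist in KM today, the names are hypothetical:

```csharp
// Hypothetical configuration surface mirroring the requirements above.
public class EmbeddingsCacheConfig
{
    // The whole feature is opt-in.
    public bool Enabled { get; set; } = false;

    // Storage type and location, e.g. "Disk", "AzureBlobs", "Redis".
    public string StorageType { get; set; } = "Disk";
    public string? ConnectionString { get; set; }

    // Maximum storage budget for cached embeddings, in megabytes.
    public int MaxSizeMb { get; set; } = 1024;

    // Whether deleting a document also evicts its cached embeddings.
    public bool EvictOnDocumentDelete { get; set; } = true;
}
```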

Importance

Would be great to have.

@0x7c13 0x7c13 added the enhancement New feature or request label Apr 1, 2024
@0x7c13 0x7c13 changed the title [Feature Request] Caching and manage the embeddings in a persistent storage [Feature Request] Cache and manage the embeddings in a persistent storage Apr 1, 2024
dluc (Collaborator) commented Apr 8, 2024

Posting here some notes from the PR:

  • KM uses multiple embedding generators, so it's important to consider not only the content, but also the generator used and its underlying configuration, e.g. which model.
  • The cache should persist across reboots to provide real benefit, and should be shared over the network when KM runs on multiple nodes.
  • As a persistence layer I would consider reusing the available Content Storage, which is itself configurable to store data on disk, in Azure Blobs, or in MongoDB.

In an early SK prototype I cached embeddings in the underlying HTTP layer of the embedding generator, so I could use as the cache key the AI provider (e.g. the OpenAI endpoint), the AI model name, and other parameters contributing to uniqueness. These parameters are more easily available inside the generator than in the code calling it.
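
A hedged sketch of such a composite key, assuming hypothetical names; the point is simply that endpoint, model and content all contribute to uniqueness:

```csharp
using System;
using System.Security.Cryptography;
using System.Text;

public static class GeneratorCacheKey
{
    // Combines everything that affects the resulting vector: the AI provider
    // endpoint, the model name, and the text itself. Two different models can
    // never share a cache entry this way.
    public static string Compose(string endpoint, string modelName, string text)
    {
        string material = $"{endpoint}|{modelName}|{text}";
        byte[] hash = SHA256.HashData(Encoding.UTF8.GetBytes(material));
        return Convert.ToHexString(hash);
    }
}
```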

My recommendation would be to integrate the cache behavior inside the generators, rather than caching in each client that calls an embedding generator.
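
In that spirit, a minimal decorator sketch, reusing the hypothetical IEmbeddingsCacheDb and GeneratorCacheKey from above; the IEmbeddingGenerator interface here is a simplified stand-in, not KM's actual generator interface:

```csharp
using System.Threading;
using System.Threading.Tasks;

// Simplified stand-in for the real embedding generator interface.
public interface IEmbeddingGenerator
{
    Task<float[]> GenerateEmbeddingAsync(string text, CancellationToken ct = default);
}

// Wraps any generator with a cache, so handlers and clients calling the
// generator remain unaware that caching happens at all.
public class CachedEmbeddingGenerator : IEmbeddingGenerator
{
    private readonly IEmbeddingGenerator _inner;
    private readonly IEmbeddingsCacheDb _cache;
    private readonly string _endpoint;
    private readonly string _modelName;

    public CachedEmbeddingGenerator(
        IEmbeddingGenerator inner,
        IEmbeddingsCacheDb cache,
        string endpoint,
        string modelName)
    {
        _inner = inner;
        _cache = cache;
        _endpoint = endpoint;
        _modelName = modelName;
    }

    public async Task<float[]> GenerateEmbeddingAsync(string text, CancellationToken ct = default)
    {
        string key = GeneratorCacheKey.Compose(_endpoint, _modelName, text);

        float[]? cached = await _cache.TryGetAsync(key, ct);
        if (cached is not null) { return cached; }

        float[] embedding = await _inner.GenerateEmbeddingAsync(text, ct);
        await _cache.SetAsync(key, embedding, ct);
        return embedding;
    }
}
```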

dluc (Collaborator) commented May 17, 2024

Looks like the PR has become stale, with a few things to address.

If this is a pressing problem, the approach should be reusable (e.g. not having to add caching logic to every handler; caching is usually a cross-cutting concern solved with generic KV stores decoupled from specific scenarios), scale over multiple VMs (e.g. allow extending the solution with Redis/Memcached), and be optional via config settings.
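
For illustration, "optional via config" could reduce to skipping the decorator wrap at wiring time. This reuses the hypothetical types sketched above and assumes placeholder OpenAIEmbeddingGenerator / RedisEmbeddingsCacheDb implementations, plus endpoint, modelName and an EmbeddingsCacheConfig in scope:

```csharp
// Hypothetical wiring: the cache is a pure decorator, so turning it off
// just skips the wrap and nothing else in the pipeline changes.
IEmbeddingGenerator generator = new OpenAIEmbeddingGenerator(endpoint, modelName);

if (config.Enabled)
{
    IEmbeddingsCacheDb cache = new RedisEmbeddingsCacheDb(config.ConnectionString!);
    generator = new CachedEmbeddingGenerator(generator, cache, endpoint, modelName);
}
```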

@dluc dluc closed this as completed May 17, 2024
@dluc dluc reopened this May 17, 2024
@microsoft microsoft locked and limited conversation to collaborators Jun 5, 2024
@dluc dluc converted this issue into discussion #616 Jun 5, 2024

This issue was moved to a discussion.

You can continue the conversation there. Go to discussion →
