[FEATURE] Incremental insertion to leading segment for PQ quantized graphs

**Is your feature request related to a problem?**
This feature is part of the broader [leading segment merge](https://github.com/opensearch-project/opensearch-jvector/issues/133) work.
At the moment, incremental insertion of new nodes during merges works by choosing a leading segment and then adding the new nodes to that segment's existing graph.

However, there are a couple of limitations at the moment:
1. Deletes
2. Quantization

The quantization limitation is especially problematic because quantization is important for constructing graphs in a RAM-constrained environment.

**What solution would you like?**
We would like to support incremental graph construction with a leading segment on PQ-quantized graphs, using the following technique:
1. Checkpoint the PQ codebooks after each merge.
2. Determine whether the drift of the codebook centroids is substantial or whether the codebooks can be reused.
3. If the codebook drift is significant, reconstruct the graph.
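
The steps above could be sketched roughly as follows. This is only an illustration, not the jVector or plugin API: the nested-list codebook layout, the names `centroid_drift` and `plan_merge`, and the `0.05` threshold are all hypothetical. It also assumes the candidate codebook was refined starting from the checkpoint, so that centroid `k` in subspace `m` corresponds to the same centroid in both codebooks.

```python
import math

# Hypothetical layout: codebook[m][k] is the centroid (list of floats)
# for subspace m and cluster k.


def centroid_drift(checkpoint, candidate):
    """Mean centroid displacement divided by the mean checkpoint centroid norm."""
    total_shift = 0.0
    total_norm = 0.0
    count = 0
    for old_sub, new_sub in zip(checkpoint, candidate):
        for old_c, new_c in zip(old_sub, new_sub):
            total_shift += math.dist(old_c, new_c)  # how far this centroid moved
            total_norm += math.hypot(*old_c)        # scale of the old centroid
            count += 1
    if count == 0:
        return 0.0
    if total_norm == 0.0:
        return 0.0 if total_shift == 0.0 else math.inf
    return total_shift / total_norm


def plan_merge(checkpoint, candidate, threshold=0.05):
    """Decide whether the checkpointed codebooks can be reused.

    The threshold is an arbitrary placeholder and would need tuning.
    """
    if centroid_drift(checkpoint, candidate) > threshold:
        return "rebuild_graph"    # step 3: drift is significant
    return "reuse_codebooks"      # insert incrementally into the leading segment


if __name__ == "__main__":
    checkpoint = [[[1.0, 0.0], [0.0, 1.0]], [[2.0, 0.0], [0.0, 2.0]]]
    small_shift = [[[1.01, 0.0], [0.0, 1.0]], [[2.0, 0.0], [0.0, 2.0]]]
    large_shift = [[[2.0, 0.0], [0.0, 1.0]], [[2.0, 0.0], [0.0, 2.0]]]
    print(plan_merge(checkpoint, small_shift))
    print(plan_merge(checkpoint, large_shift))
```

A relative measure is used here so that one threshold can apply regardless of vector scale. Other drift definitions, such as the change in quantization error on the newly added vectors, may be a better fit in practice.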

We believe that, for the most part, once a large graph has been constructed (1B nodes, for example), adding smaller batches will not cause substantial drift in the PQ codebooks.
The approach above should therefore work in most cases, with a full graph reconstruction needed only occasionally.

**What alternatives have you considered?**
NA

**Do you have any additional context?**
NA
