[FEATURE] Integrate Lucene's Optimized Scalar Quantizer to improve recall for 4 and 7 bit #3136

@navneet1v

Description

As part of apache/lucene#15169, Lucene added the Optimized Scalar Quantizer (OSQ), which provides better recall than the previous scalar quantizer. As part of apache/lucene#15064, the new Lucene104ScalarQuantizedVectorsFormat started using it.

The new scalar quantizer also supports other quantization levels, such as 1-bit, 2-bit, and 8-bit. Ref: https://github.com/apache/lucene/blob/fab626791b234f1dffb0fc67785c5465bbece167/lucene/core/src/java/org/apache/lucene/codecs/lucene104/Lucene104ScalarQuantizedVectorsFormat.java#L119
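For readers less familiar with what the bit width controls, here is a minimal sketch of a plain min/max uniform scalar quantizer. It is not Lucene's OSQ implementation (which, per the upstream PR, optimizes the quantization interval); the function names `quantize`, `dequantize`, and `max_error` are hypothetical and exist only for illustration.

```python
from typing import List, Tuple


def quantize(vector: List[float], bits: int) -> Tuple[List[int], float, float]:
    """Map each float to an integer level in [0, 2**bits - 1].

    Illustrative only: uses the vector's own min/max as the interval,
    which is an assumption and not how Lucene's OSQ picks its interval.
    """
    lo, hi = min(vector), max(vector)
    levels = (1 << bits) - 1
    # Guard against a constant vector, where hi == lo.
    step = (hi - lo) / levels if hi > lo else 1.0
    codes = [round((x - lo) / step) for x in vector]
    return codes, lo, step


def dequantize(codes: List[int], lo: float, step: float) -> List[float]:
    """Reconstruct approximate floats from the integer levels."""
    return [lo + c * step for c in codes]


def max_error(vector: List[float], bits: int) -> float:
    """Largest absolute reconstruction error for the given bit width."""
    codes, lo, step = quantize(vector, bits)
    restored = dequantize(codes, lo, step)
    return max(abs(a - b) for a, b in zip(vector, restored))


if __name__ == "__main__":
    sample = [-1.0, -0.25, 0.1, 0.6, 1.0]
    for bits in (1, 2, 4, 8):
        print(bits, max_error(sample, bits))
```

Fewer bits mean fewer levels and a coarser step, so the reconstruction error grows. That is why recall at low bit widths depends so heavily on how well the interval is chosen.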

Results for the 4-bit and 8-bit quantizers, copied directly from apache/lucene#15169:

luceneutil benchmark results. The OSQ (Optimized Scalar Quantizer) results are the rows marked -4 bits and -8 bits; the rows marked 4 bits and 8 bits are the previous scalar quantizer.

Results:

recall  latency(ms)  netCPU  avgCpuCount     nDoc  topK  fanout  maxConn  beamWidth  quantized  index(s)  index_docs/s  force_merge(s)  num_segments  index_size(MB)  vec_disk(MB)  vec_RAM(MB)  indexType
 0.875        0.858   0.855        0.996  1000000    10     100       32        250    -4 bits    203.69       4909.49          168.11             1         3349.44      3311.157      381.470       HNSW
 0.954        1.222   1.217        0.996  1000000    10     100       32        250    -8 bits    333.87       2995.20          193.90             1         3717.10      3677.368      747.681       HNSW
 0.450        1.881   1.816        0.965  1000000    10     100       32        250     4 bits    510.04       1960.65          285.58             1         3346.30      3299.713      370.026       HNSW
 0.928        1.245   1.241        0.997  1000000    10     100       32        250     8 bits    325.14       3075.58          207.41             1         3705.70      3665.924      736.237       HNSW
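To make the comparison in the table easier to read, the sketch below recomputes the per-bit-width differences from the recall, latency, and index-time columns. The numbers are copied from the table; the `osq` and `legacy` labels and the `compare` helper are names introduced here, on the assumption that the rows without a minus sign are the previous quantizer.

```python
# (recall, latency_ms, index_seconds) copied from the benchmark table.
# "osq" = rows marked -4/-8 bits, "legacy" = rows marked 4/8 bits (assumed).
RESULTS = {
    ("osq", 4): (0.875, 0.858, 203.69),
    ("osq", 8): (0.954, 1.222, 333.87),
    ("legacy", 4): (0.450, 1.881, 510.04),
    ("legacy", 8): (0.928, 1.245, 325.14),
}


def compare(bits: int) -> dict:
    """Return OSQ-vs-legacy deltas and ratios for one bit width."""
    new_recall, new_latency, new_index = RESULTS[("osq", bits)]
    old_recall, old_latency, old_index = RESULTS[("legacy", bits)]
    return {
        "recall_gain": round(new_recall - old_recall, 3),
        "latency_ratio": round(new_latency / old_latency, 3),
        "index_time_ratio": round(new_index / old_index, 3),
    }


if __name__ == "__main__":
    for bits in (4, 8):
        print(bits, compare(bits))
```

At 4 bits, recall rises by 0.425 while query latency and indexing time both drop to less than half. At 8 bits the gain is smaller (0.026 recall) and latency and indexing time are roughly unchanged.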

Some caveats:

  1. The new OSQ doesn't support the confidence interval or the compress flag, so we need to deprecate those mappings. This is a good thing, as it simplifies the interface.
  2. To support older indices, we currently need to copy the format that was removed upstream; the copy can be dropped in a major version.

Metadata

Labels: Features (introduces a new unit of functionality that satisfies a requirement), enhancement

Status: Backlog (Hot)
