@tteofili
Contributor

@tteofili tteofili commented Nov 26, 2025

This introduces an extension of Lucene's HnswQueueSaturationCollector that avoids static parameters for patience and saturation threshold.
The patience parameter of HnswQueueSaturationCollector depends on k, which our query API also manipulates through num_candidates, so a static value is hard to control.
Instead of static queue saturation and patience settings, this collector tracks a smoothed discovery rate and derives an adaptive saturation threshold from the mean and standard deviation of that rate.
This is likely to work better across different doc-to-doc and query-to-vector distributions.
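
To make the idea concrete, here is a minimal, self-contained sketch of an adaptive saturation check. It is not the collector from this PR and it does not use any Lucene API: the class name `AdaptiveSaturationSketch`, the smoothing factor, the warm-up length, and the "two consecutive low steps" stop rule are all assumptions chosen for illustration. It smooths the per-step discovery rate (new top-k entries divided by visited nodes) with an exponential moving average, tracks the running mean and standard deviation of that smoothed rate, and treats `mean - stdDev` as the saturation threshold.

```java
/** Hypothetical illustration only; not the PR's collector and not a Lucene API. */
final class AdaptiveSaturationSketch {
    private static final double ALPHA = 0.5;        // assumed smoothing factor for the moving average
    private static final int WARM_UP_STEPS = 3;     // assumed: gather some statistics before judging
    private static final int LOW_STEPS_TO_STOP = 2; // assumed placeholder for an adaptive patience

    private double smoothedRate; // exponentially smoothed discovery rate
    private double mean;         // running mean of the smoothed rate
    private double m2;           // running sum of squared deviations (Welford's algorithm)
    private long steps;
    private int consecutiveLow;

    /** Records one traversal step: nodes visited and how many of them entered the top-k queue. */
    void record(int visited, int newlyCollected) {
        if (visited <= 0) {
            return;
        }
        double rate = (double) newlyCollected / visited;
        smoothedRate = steps == 0 ? rate : ALPHA * rate + (1 - ALPHA) * smoothedRate;

        // Compare against the threshold built from earlier steps only.
        boolean low = steps >= WARM_UP_STEPS && smoothedRate < threshold();
        consecutiveLow = low ? consecutiveLow + 1 : 0;

        // Welford update of mean and variance with the new smoothed rate.
        steps++;
        double delta = smoothedRate - mean;
        mean += delta / steps;
        m2 += delta * (smoothedRate - mean);
    }

    /** Adaptive threshold: one standard deviation below the mean smoothed rate, floored at zero. */
    double threshold() {
        double stdDev = steps > 1 ? Math.sqrt(m2 / (steps - 1)) : 0.0;
        return Math.max(0.0, mean - stdDev);
    }

    /** True once the smoothed rate has stayed below the threshold for enough consecutive steps. */
    boolean shouldStop() {
        return consecutiveLow >= LOW_STEPS_TO_STOP;
    }

    public static void main(String[] args) {
        AdaptiveSaturationSketch s = new AdaptiveSaturationSketch();
        for (int i = 0; i < 4; i++) {
            s.record(2, 1); // steady discovery: half of the visited nodes are collected
        }
        s.record(10, 0);    // discovery dries up
        s.record(10, 0);
        System.out.println("stop = " + s.shouldStop());
    }
}
```

Because the threshold is derived from the rates observed during the same search, it moves with the dataset and the query instead of relying on a fixed cut-off such as 0.995. The fixed constants left in the sketch are simplifications and say nothing about how the PR handles patience.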

@tteofili
Contributor Author

Buildkite benchmark this with so-vector please

@tteofili
Contributor Author

baseline (early_termination=false)

index_name                    index_type  visit_percentage(%)  latency(ms)  net_cpu_time(ms)  avg_cpu_count      QPS  recall   visited  filter_selectivity  filter_cached  oversampling_factor
----------------------------  ----------  -------------------  -----------  ----------------  -------------  -------  ------  --------  ------------------  -------------  -------------------
wiki1024en.docs                     hnsw                0.000         1.47              5.67           3.86   680.27    0.97  40448.13                1.00           true                 3.00
fiqa-en.docs                        hnsw                0.000         0.25              0.67           2.62  3937.01    0.93   8683.86                1.00           true                 3.00
corpus-quora-E5-small.fvec          hnsw                0.000         1.14              4.16           3.65   877.19    0.97  35952.75                1.00           true                 3.00

baseline (early_termination=true with Lucene's defaults)

index_name                    index_type  visit_percentage(%)  latency(ms)  net_cpu_time(ms)  avg_cpu_count      QPS  recall   visited  filter_selectivity  filter_cached  oversampling_factor
----------------------------  ----------  -------------------  -----------  ----------------  -------------  -------  ------  --------  ------------------  -------------  -------------------
wiki1024en.docs                     hnsw                0.000         1.50              5.67           3.78   666.67    0.97  40448.13                1.00           true                 3.00
fiqa-en.docs                        hnsw                0.000         0.26              0.71           2.74  3846.15    0.93   8608.64                1.00           true                 3.00
corpus-quora-E5-small.fvec          hnsw                0.000         1.21              4.33           3.58   826.45    0.97  35952.75                1.00           true                 3.00

baseline (early_termination=true with patience p=max(7,k*0.1), saturation threshold s=0.995)

index_name                    index_type  visit_percentage(%)  latency(ms)  net_cpu_time(ms)  avg_cpu_count      QPS  recall   visited  filter_selectivity  filter_cached  oversampling_factor
----------------------------  ----------  -------------------  -----------  ----------------  -------------  -------  ------  --------  ------------------  -------------  -------------------
wiki1024en.docs                     hnsw                0.000         1.26              4.55           3.61   793.65    0.97  31300.27                1.00           true                 3.00
fiqa-en.docs                        hnsw                0.000         0.23              0.59           2.54  4310.34    0.93   7431.86                1.00           true                 3.00
corpus-quora-E5-small.fvec          hnsw                0.000         1.04              3.56           3.42   961.54    0.97  27188.93                1.00           true                 3.00

candidate

index_name                    index_type  visit_percentage(%)  latency(ms)  net_cpu_time(ms)  avg_cpu_count      QPS  recall   visited  filter_selectivity  filter_cached  oversampling_factor
----------------------------  ----------  -------------------  -----------  ----------------  -------------  -------  ------  --------  ------------------  -------------  -------------------
wiki1024en.docs                     hnsw                0.000         0.85              2.44           2.87  1176.47    0.96  16735.25                1.00           true                 3.00
fiqa-en.docs                        hnsw                0.000         0.21              0.52           2.47  4761.90    0.92   6697.43                1.00           true                 3.00
corpus-quora-E5-small.fvec          hnsw                0.000         0.79              2.34           2.96  1265.82    0.96  14050.15                1.00           true                 3.00

The candidate is much faster and lighter (a much lower visited count) than the current collector with tweaked params, at the cost of about 0.01 recall on each of the three datasets in these runs (with 3x oversampling).
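
To put numbers on the comparison, the relative change can be computed directly from the tables above. The small helper below is only an illustration (the class and method names are made up for this comment). It takes the wiki1024en.docs row from the tuned baseline (p=max(7,k*0.1), s=0.995) and from the candidate.

```java
/** Hypothetical helper for reading the benchmark tables; not part of the PR. */
final class BenchmarkDelta {
    /** Ratio of candidate to baseline, for metrics where higher is better (QPS). */
    static double ratio(double baseline, double candidate) {
        return candidate / baseline;
    }

    /** Fractional reduction relative to baseline, for metrics where lower is better (visited, latency). */
    static double reduction(double baseline, double candidate) {
        return 1.0 - candidate / baseline;
    }

    public static void main(String[] args) {
        // wiki1024en.docs: tuned baseline vs candidate, values copied from the tables.
        double qpsRatio = ratio(793.65, 1176.47);
        double visitedReduction = reduction(31300.27, 16735.25);
        double latencyReduction = reduction(1.26, 0.85);

        System.out.printf("QPS ratio: %.2fx%n", qpsRatio);
        System.out.printf("visited reduction: %.1f%%%n", visitedReduction * 100);
        System.out.printf("latency reduction: %.1f%%%n", latencyReduction * 100);
    }
}
```

For wiki1024en.docs this works out to a QPS ratio of roughly 1.48x and about 46% fewer visited entries against the tuned baseline. The same two functions apply unchanged to the other rows.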
