enhance(pyramid): support dynamically adjusting size of subindex search parameters #1508

inabao · 2026-01-08T09:03:08Z

No description provided.

gemini-code-assist · 2026-01-08T09:03:35Z

Summary of Changes

Hello @inabao, I'm Gemini Code Assist¹! I'm currently reviewing this pull request and will post my feedback shortly. In the meantime, here's a summary to help you and other reviewers quickly get up to speed!

This pull request significantly enhances the Pyramid indexing algorithm by introducing dynamic adjustment capabilities for search parameters, particularly ef_search, within its subindexes. This change is designed to improve the efficiency and precision of approximate nearest neighbor searches by adapting to varying data sizes. Additionally, it refines graph construction parameters, optimizes memory usage, and corrects a critical topk assignment in KNN search. The accompanying test suite has been updated to validate these enhancements and ensure the stability of the system.

Highlights

Dynamic Search Parameter Adjustment: Introduced a mechanism to dynamically calculate the ef_search parameter for subindexes based on the topk value and the number of data points. This aims to optimize search performance and accuracy across different data scales within the Pyramid index.
ODescent Graph Configuration: Enabled dynamic modification of the max_degree parameter for the ODescent graph construction process, allowing for more flexible graph building based on runtime conditions.
KnnSearch Parameter Correction: Corrected an issue in the KnnSearch function where the topk search parameter was incorrectly assigned the value of ef_search instead of the actual requested k.
Memory Management Refinements: Improved memory handling by explicitly clearing ids_ vectors in IndexNode and Pyramid::add_one_point after their initial use, reducing unnecessary memory consumption.
Test Suite Updates: Expanded test coverage to include both 'nsw' and 'odescent' graph types and adjusted the recall accuracy thresholds in several test cases from 0.96 to 0.94 to accommodate the new dynamic parameter adjustments.
Resource Pool Lambda Capture Fix: Addressed a potential issue in ResourceObjectPool by explicitly capturing this in a lambda, ensuring correct memory_usage tracking.

🧠 New Feature in Public Preview: You can now enable Memory to help Gemini Code Assist learn from your team's feedback. This makes future code reviews more consistent and personalized to your project's style. Click here to enable Memory in your admin console.

Using Gemini Code Assist

The full guide for Gemini Code Assist can be found on our documentation page, here are some quick tips.

Invoking Gemini

You can request assistance from Gemini at any point by creating a comment using either /gemini <command> or @gemini-code-assist <command>. Below is a summary of the supported commands on the current page.

Feature	Command	Description
Code Review	`/gemini review`	Performs a code review for the current pull request in its current state.
Pull Request Summary	`/gemini summary`	Provides a summary of the current pull request in its current state.
Comment	@gemini-code-assist	Responds in comments when explicitly tagged, both in pull request comments and review comments.
Help	`/gemini help`	Displays a list of available commands.

Customization

To customize Gemini Code Assist for GitHub experience, repository maintainers can create a configuration file and/or provide a custom code review style guide (such as PEP-8 for Python) by creating and adding files to a .gemini/ folder in the base of the repository. Detailed instructions can be found here.

Limitations & Feedback

Gemini Code Assist may make mistakes. Please leave feedback on any instances where its feedback is incorrect or counter productive. You can react with 👍 and 👎 on @gemini-code-assist comments. If you're interested in giving your feedback about your experience with Gemini Code Assist for Github and other Google products, sign up here.

You can also get AI-powered code generation, chat, as well as code reviews directly in the IDE at no cost with the Gemini Code Assist IDE Extension.

Review the Privacy Notices, Generative AI Prohibited Use Policy, Terms of Service, and learn how to configure Gemini Code Assist in GitHub here. Gemini can make mistakes, so double check it and use code with caution. ↩

gemini-code-assist

Code Review

This pull request enhances the pyramid index by dynamically adjusting search parameters for sub-indexes, which is a valuable improvement for performance tuning. The core logic involves a new function get_suitable_ef_search to adapt the ef value based on the sub-index size. The changes are generally well-implemented, but I've identified a critical compilation error in resource_object_pool.h that needs to be addressed. Additionally, I've suggested refactoring some magic numbers into named constants for better code clarity and improving the documentation in an example file.

gemini-code-assist · 2026-01-08T09:05:57Z

src/utils/resource_object_pool.h

            auto ptr = std::make_shared<T>(args...);
            auto value = ptr->MemoryUsage();
-            memory_usage_.fetch_add(value, std::memory_order_relaxed);
+            pool->fetch_add(value, std::memory_order_relaxed);


There seems to be a typo here. The ResourceObjectPool class does not have a fetch_add method. You probably meant to access the memory_usage_ member of the pool object. This will cause a compilation error.

Suggested change

pool->fetch_add(value, std::memory_order_relaxed);

pool->memory_usage_.fetch_add(value, std::memory_order_relaxed);

gemini-code-assist · 2026-01-08T09:05:58Z

examples/cpp/107_index_pyramid.cpp

+    // - "base_quantization_type": the base quantization codes
+    // - "precise_quantization_type": the precise quantization codes
+    // - "index_min_size": The minimum size required to build the index
+    // - "support_duplicate": support for duplicate data in the index


The comments for the new pyramid parameters are a bit inconsistent and could be more descriptive. For example, "the base quantization codes" could be clarified as "The quantization type for base vectors." Also, please ensure consistent capitalization at the beginning of each comment line for better readability.

Suggested change

// - "base_quantization_type": the base quantization codes

// - "precise_quantization_type": the precise quantization codes

// - "index_min_size": The minimum size required to build the index

// - "support_duplicate": support for duplicate data in the index

// - "base_quantization_type": The quantization type for base vectors.

// - "precise_quantization_type": The quantization type for precise vectors, used for reordering.

// - "index_min_size": The minimum number of vectors required to build a sub-index.

// - "support_duplicate": Support for duplicate data in the index.

gemini-code-assist · 2026-01-08T09:05:58Z

src/algorithm/pyramid.cpp

+get_suitable_ef_search(int64_t topk, int64_t data_num) {
+    if (data_num < 1'000) {
+        return std::max(static_cast<uint64_t>(1.5F * topk), 50UL);
+    }
+    if (data_num < 100'000) {
+        return std::max(static_cast<uint64_t>(2.0F * topk), 100UL);
+    }
+    if (data_num < 1'000'000) {
+        return std::max(static_cast<uint64_t>(3.0F * topk), 200UL);
+    }
+    return std::max(static_cast<uint64_t>(4.0F * topk), 400UL);
+}


The function get_suitable_ef_search uses several magic numbers for data size thresholds, multipliers, and minimum ef values. To improve readability and maintainability, it's better to define these as named constants.

get_suitable_ef_search(int64_t topk, int64_t data_num) { constexpr int64_t kDataNumThreshold1 = 1'000; constexpr int64_t kDataNumThreshold2 = 100'000; constexpr int64_t kDataNumThreshold3 = 1'000'000; constexpr float kMultiplier1 = 1.5F; constexpr float kMultiplier2 = 2.0F; constexpr float kMultiplier3 = 3.0F; constexpr float kMultiplier4 = 4.0F; constexpr uint64_t kMinEf1 = 50UL; constexpr uint64_t kMinEf2 = 100UL; constexpr uint64_t kMinEf3 = 200UL; constexpr uint64_t kMinEf4 = 400UL; if (data_num < kDataNumThreshold1) { return std::max(static_cast<uint64_t>(kMultiplier1 * topk), kMinEf1); } if (data_num < kDataNumThreshold2) { return std::max(static_cast<uint64_t>(kMultiplier2 * topk), kMinEf2); } if (data_num < kDataNumThreshold3) { return std::max(static_cast<uint64_t>(kMultiplier3 * topk), kMinEf3); } return std::max(static_cast<uint64_t>(kMultiplier4 * topk), kMinEf4); }

codecov · 2026-01-08T10:54:19Z

Codecov Report

❌ Patch coverage is 73.68421% with 5 lines in your changes missing coverage. Please review.

❌ Your patch check has failed because the patch coverage (73.68%) is below the target coverage (80.00%). You can increase the patch coverage or adjust the target coverage.

@@            Coverage Diff             @@
##             main    #1508      +/-   ##
==========================================
+ Coverage   91.13%   91.49%   +0.36%     
==========================================
  Files         327      327              
  Lines       19146    19163      +17     
==========================================
+ Hits        17449    17534      +85     
+ Misses       1697     1629      -68

Flag	Coverage Δ
cpp	`91.49% <73.68%> (+0.36%)`	⬆️

Flags with carried forward coverage won't be shown. Click here to find out more.

Components	Coverage Δ
common	`85.66% <ø> (ø)`
datacell	`93.52% <ø> (+0.90%)`	⬆️
index	`91.32% <70.58%> (+0.26%)`	⬆️
simd	`100.00% <ø> (ø)`

Continue to review full report in Codecov by Sentry.

Legend - Click here to learn more
Δ = absolute <relative> (impact), ø = not affected, ? = missing data
Powered by Codecov. Last update 6fd27f6...5804ea2. Read the comment docs.

🚀 New features to boost your workflow:

❄️ Test Analytics: Detect flaky tests, report on failures, and find test suite problems.

examples/cpp/107_index_pyramid.cpp

LHT129 · 2026-01-12T05:24:17Z

src/algorithm/pyramid.cpp

+get_suitable_ef_search(int64_t topk, int64_t data_num) {
+    auto topk_float = static_cast<float>(topk);
+    if (data_num < 1'000) {
+        return std::max(static_cast<uint64_t>(1.5F * topk_float), 50UL);


mark the magic number and add "LL" at end

cannot do it

src/algorithm/pyramid.cpp

src/impl/odescent/odescent_graph_builder.h

LHT129 · 2026-01-12T05:31:08Z

tests/test_index.cpp

            auto range_result =
                index->RangeSearch(query, threshold, search_param, dataset->filter_function_);
-            REQUIRE(range_result.value()->GetDim() >= topk - 1);
+            if (range_result.value()->GetDim() < topk - 1) {


why change REQUIRE to WARN

Prevent this process from blocking the CI pipeline

wxyucs

lgtm

LHT129

LGTM

Signed-off-by: jinjiabao.jjb <[email protected]>

jiaweizone · 2026-01-16T09:38:52Z

src/algorithm/pyramid.cpp

    // add one point
    if (node->status_ == IndexNode::Status::NO_INDEX) {
        node->Init();
+        Vector<InnerIdType>(allocator_).swap(node->ids_);


use swap bacause need shrink memory ?

jiaweizone · 2026-01-16T09:50:43Z

src/algorithm/pyramid.cpp

+                std::min(modified_param.ef,
+                         get_suitable_ef_search(search_param.topk, node->graph_->TotalCount()));
+        }
+        modified_param.topk = static_cast<int64_t>(modified_param.ef);


too much type convert ...

jiaweizone · 2026-01-16T09:54:31Z

src/algorithm/pyramid.cpp

        InnerSearchParam modified_param = search_param;
        modified_param.ep = node->entry_point_;
+        if (node->level_ != 0 && search_param.search_mode == KNN_SEARCH) {
+            modified_param.ef =


We can force disable the auto opitmize ef search future ?

inabao self-assigned this Jan 8, 2026

inabao added kind/improvement Code improvements (variable/function renaming, refactoring, etc. ) version/0.18 labels Jan 8, 2026

inabao requested review from LHT129, ShawnShawnYou, jiaweizone and wxyucs as code owners January 8, 2026 09:03

pull-request-size bot added the size/M label Jan 8, 2026

mergify bot added module/example module/testing labels Jan 8, 2026

gemini-code-assist bot reviewed Jan 8, 2026

View reviewed changes

LHT129 reviewed Jan 12, 2026

View reviewed changes

inabao requested a review from LHT129 January 12, 2026 09:24

inabao force-pushed the dynamic-pyramid branch 2 times, most recently from 74ac4e5 to 361a7e4 Compare January 15, 2026 11:01

wxyucs approved these changes Jan 16, 2026

View reviewed changes

LHT129 approved these changes Jan 16, 2026

View reviewed changes

inabao added 8 commits January 16, 2026 12:02

modify

aee921d

Signed-off-by: jinjiabao.jjb <[email protected]>

modify

db2bbfd

Signed-off-by: jinjiabao.jjb <[email protected]>

modify

bd6eb6c

Signed-off-by: jinjiabao.jjb <[email protected]>

modify

b954f5e

Signed-off-by: jinjiabao.jjb <[email protected]>

modify

6716ee4

Signed-off-by: jinjiabao.jjb <[email protected]>

modify

c8fc71a

Signed-off-by: jinjiabao.jjb <[email protected]>

modify

a965317

Signed-off-by: jinjiabao.jjb <[email protected]>

modify

5804ea2

Signed-off-by: jinjiabao.jjb <[email protected]>

inabao force-pushed the dynamic-pyramid branch from 361a7e4 to 5804ea2 Compare January 16, 2026 05:53

jiaweizone reviewed Jan 16, 2026

View reviewed changes

	pool->fetch_add(value, std::memory_order_relaxed);
	pool->memory_usage_.fetch_add(value, std::memory_order_relaxed);

enhance(pyramid): support dynamically adjusting size of subindex search parameters #1508

Are you sure you want to change the base?

enhance(pyramid): support dynamically adjusting size of subindex search parameters #1508

Conversation

inabao commented Jan 8, 2026

Uh oh!

gemini-code-assist bot commented Jan 8, 2026

Summary of Changes

Highlights

Footnotes

Uh oh!

gemini-code-assist bot left a comment

Choose a reason for hiding this comment

Code Review

Uh oh!

gemini-code-assist bot Jan 8, 2026

Choose a reason for hiding this comment

Uh oh!

gemini-code-assist bot Jan 8, 2026

Choose a reason for hiding this comment

Uh oh!

gemini-code-assist bot Jan 8, 2026

Choose a reason for hiding this comment

Uh oh!

codecov bot commented Jan 8, 2026 • edited Loading Uh oh! There was an error while loading. Please reload this page.

Uh oh!

Codecov Report

Uh oh!

Uh oh!

LHT129 Jan 12, 2026

Choose a reason for hiding this comment

Uh oh!

inabao Jan 12, 2026

Choose a reason for hiding this comment

Uh oh!

Uh oh!

Uh oh!

Uh oh!

Uh oh!

Uh oh!

LHT129 Jan 12, 2026

Choose a reason for hiding this comment

Uh oh!

inabao Jan 12, 2026

Choose a reason for hiding this comment

Uh oh!

wxyucs left a comment

Choose a reason for hiding this comment

Uh oh!

LHT129 left a comment

Choose a reason for hiding this comment

Uh oh!

jiaweizone Jan 16, 2026

Choose a reason for hiding this comment

Uh oh!

jiaweizone Jan 16, 2026

Choose a reason for hiding this comment

Uh oh!

jiaweizone Jan 16, 2026

Choose a reason for hiding this comment

Uh oh!

Reviewers

Assignees

Labels

Projects

Milestone

Development

Uh oh!

5 participants

codecov bot commented Jan 8, 2026 •

edited

Loading