-
Notifications
You must be signed in to change notification settings - Fork 80
test: add search unrelated parameter tests #1455
New issue
Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.
By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.
Already on GitHub? Sign in to your account
base: 0.16
Are you sure you want to change the base?
Conversation
Summary of ChangesHello @wxyucs, I'm Gemini Code Assist1! I'm currently reviewing this pull request and will post my feedback shortly. In the meantime, here's a summary to help you and other reviewers quickly get up to speed! This pull request focuses on enhancing the robustness of the search parameter parsing across various vector search indices. It introduces new test infrastructure and specific test cases for DiskANN, HGraph, HNSW, and SINDI to ensure that their search functionalities can handle and ignore irrelevant parameters within the search configuration. This prevents potential errors or unexpected behavior when users provide extraneous information, thereby improving the overall stability and fault tolerance of the search system. Highlights
🧠 New Feature in Public Preview: You can now enable Memory to help Gemini Code Assist learn from your team's feedback. This makes future code reviews more consistent and personalized to your project's style. Click here to enable Memory in your admin console. Using Gemini Code AssistThe full guide for Gemini Code Assist can be found on our documentation page, here are some quick tips. Invoking Gemini You can request assistance from Gemini at any point by creating a comment using either
Customization To customize Gemini Code Assist for GitHub experience, repository maintainers can create a configuration file and/or provide a custom code review style guide (such as PEP-8 for Python) by creating and adding files to a Limitations & Feedback Gemini Code Assist may make mistakes. Please leave feedback on any instances where its feedback is incorrect or counter productive. You can react with 👍 and 👎 on @gemini-code-assist comments. If you're interested in giving your feedback about your experience with Gemini Code Assist for Github and other Google products, sign up here. You can also get AI-powered code generation, chat, as well as code reviews directly in the IDE at no cost with the Gemini Code Assist IDE Extension. Footnotes
|
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Code Review
This pull request introduces tests to ensure that search operations handle unrelated parameters gracefully, which is a valuable addition for API robustness. The refactoring of test fixtures in test_diskann.cpp and test_hnsw.cpp to centralize test resource management is also a good improvement. I've identified a few minor issues, including typos in test names and JSON keys, a copy-paste error in test tags, and some redundant loops in the new tests that could be optimized. Addressing these points will enhance the quality and maintainability of the test suite.
tests/test_diskann.cpp
Outdated
| "ef_search": 200, | ||
| "io_limit": 200, | ||
| "beam_search": 4, | ||
| "-------unreleated parameters below-------": true, |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
| for (auto& [base_quantization_str, recall] : resource->test_cases) { | ||
| INFO(fmt::format("metric_type: {}, dim: {}, base_quantization_str: {}, recall: {}", | ||
| metric_type, | ||
| dim, | ||
| base_quantization_str, | ||
| recall)); | ||
| vsag::Options::Instance().set_block_size_limit(size); | ||
| auto param = | ||
| DiskANNTestIndex::GenerateDiskANNBuildParametersString(metric_type, dim); | ||
| auto index = TestIndex::TestFactory(test_index->name, param, true); | ||
| auto dataset = DiskANNTestIndex::pool.GetDatasetAndCreate( | ||
| dim, resource->base_count, metric_type); | ||
| TestIndex::TestBuildIndex(index, dataset, true); | ||
| TestIndex::TestSearchUnrelatedParameter(index, dataset, search_param); | ||
| } |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
The loop over resource->test_cases is redundant because the variables base_quantization_str and recall are only used for logging. The GenerateDiskANNBuildParametersString function doesn't use base_quantization_str, so the test logic inside the loop is identical for each iteration. To avoid redundant test executions, especially if more test cases are added later, consider moving the test logic outside of this loop.
tests/test_hgraph.cpp
Outdated
| constexpr const char* search_param = R"({ | ||
| "hgraph": { | ||
| "ef_search": 200, | ||
| "-------unreleated parameters below-------": true, |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
tests/test_hgraph.cpp
Outdated
|
|
||
| TEST_CASE_PERSISTENT_FIXTURE(fixtures::HGraphTestIndex, | ||
| "(PR) HGraph SearchUnrelatedParameter", | ||
| "[ft][diskann]") { |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
ditto
tests/test_hgraph.cpp
Outdated
|
|
||
| TEST_CASE_PERSISTENT_FIXTURE(fixtures::HGraphTestIndex, | ||
| "(Daily) HGraph SearchUnrelatedParameter", | ||
| "[ft][diskann]") { |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
ditto
tests/test_hnsw.cpp
Outdated
| constexpr const char* search_param = R"({ | ||
| "hnsw": { | ||
| "ef_search": 200, | ||
| "-------unreleated parameters below-------": true, |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
| for (auto& [base_quantization_str, recall] : resource->test_cases) { | ||
| INFO(fmt::format("metric_type: {}, dim: {}, base_quantization_str: {}, recall: {}", | ||
| metric_type, | ||
| dim, | ||
| base_quantization_str, | ||
| recall)); | ||
| vsag::Options::Instance().set_block_size_limit(size); | ||
| auto param = HNSWTestIndex::GenerateHNSWBuildParametersString(metric_type, dim); | ||
| auto index = TestIndex::TestFactory(test_index->name, param, true); | ||
| auto dataset = | ||
| HNSWTestIndex::pool.GetDatasetAndCreate(dim, resource->base_count, metric_type); | ||
| TestIndex::TestBuildIndex(index, dataset, true); | ||
| TestIndex::TestSearchUnrelatedParameter(index, dataset, search_param); | ||
| } |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
The loop over resource->test_cases is redundant. The variables base_quantization_str and recall are only used for logging, and GenerateHNSWBuildParametersString doesn't use base_quantization_str. This means the test logic inside the loop is the same for each iteration. Consider moving the test logic outside of this loop to avoid redundant test runs.
| } | ||
|
|
||
| TEST_CASE_PERSISTENT_FIXTURE(fixtures::SINDITestIndex, | ||
| "SINDI Search Unreleated Param", |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
tests/test_sindi.cpp
Outdated
| "n_candidate": 20, | ||
| "query_prune_ratio": 0.0, | ||
| "term_prune_ratio": 0.0, | ||
| "-------unreleated parameters below-------": true, |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
8dcc5e1 to
961f021
Compare
| TEST_CASE_PERSISTENT_FIXTURE(fixtures::DiskANNTestIndex, | ||
| "(PR) DiskANN SearchUnrelatedParameter", | ||
| "[ft][diskann]") { | ||
| auto test_index = std::make_shared<DiskANNTestIndex>(); |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
need pr tag
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
done
| "[ft][diskann]") { | ||
| auto test_index = std::make_shared<DiskANNTestIndex>(); | ||
| auto resource = test_index->GetResource(false); | ||
| TestDiskANNSearchUnrelatedParameter(test_index, resource); |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
need daily tag
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
done
tests/test_hgraph.cpp
Outdated
|
|
||
| TEST_CASE_PERSISTENT_FIXTURE(fixtures::HGraphTestIndex, | ||
| "(PR) HGraph SearchUnrelatedParameter", | ||
| "[ft][diskann]") { |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
ditto
tests/test_hgraph.cpp
Outdated
|
|
||
| TEST_CASE_PERSISTENT_FIXTURE(fixtures::HGraphTestIndex, | ||
| "(Daily) HGraph SearchUnrelatedParameter", | ||
| "[ft][diskann]") { |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
ditto
961f021 to
a2b1514
Compare
LHT129
left a comment
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
LGTM
Codecov Report✅ All modified and coverable lines are covered by tests. @@ Coverage Diff @@
## 0.16 #1455 +/- ##
==========================================
- Coverage 92.23% 92.23% -0.01%
==========================================
Files 295 295
Lines 15660 15659 -1
==========================================
- Hits 14444 14443 -1
Misses 1216 1216
Flags with carried forward coverage won't be shown. Click here to find out more.
Continue to review full report in Codecov by Sentry.
🚀 New features to boost your workflow:
|
9710678 to
b5c06dd
Compare
3b3b54f to
ba982a8
Compare
Signed-off-by: Xiangyu Wang <[email protected]>
ba982a8 to
88a4dde
Compare
No description provided.