AVX512, default Bulk SIMD for Faiss Scalar Quantized Index. #3186
Open
0ctopus13prime wants to merge 2 commits into opensearch-project:feature/faiss-bbq from
Signed-off-by: Dooyong Kim <kdooyong@amazon.com>
Description
The int4BitDotProduct function in both avx512_simd_similarity_function.cpp and arm_neon_simd_similarity_function.cpp used reinterpret_cast<const uint64_t*> on pointers that may not be 8-byte aligned when binaryCodeBytes % 8 != 0. This is undefined behavior per the C++ standard.
Replaced the casts with std::memcpy into local uint64_t variables. Optimizing compilers typically lower a fixed-size 8-byte memcpy to a single load instruction, so the fix removes the UB at no runtime cost.
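For reviewers, the pattern can be sketched roughly as follows. This is a minimal illustration, not the code from the diff: the helper name `loadU64` is hypothetical, and the PR may inline the `std::memcpy` calls directly rather than wrap them.

```cpp
#include <cstdint>
#include <cstring>

// Hypothetical helper (name not taken from the PR): read 8 bytes from an
// address that may not be 8-byte aligned. Dereferencing
// reinterpret_cast<const uint64_t*>(p) here would be undefined behavior
// when p is misaligned; copying the bytes into a local is always defined.
inline std::uint64_t loadU64(const std::uint8_t* p) {
    std::uint64_t v;
    std::memcpy(&v, p, sizeof(v));  // fixed-size copy, usually compiled to one load
    return v;
}
```

The value is read in the host byte order, exactly as the cast would have done on an aligned pointer, so popcount-based results are unchanged.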
The default similarity function (default_simd_similarity_function.cpp) only supported FP16. BBQ queries on platforms without AVX512 or NEON would fail with "Invalid native similarity function type".
Added the full BBQ scoring pipeline to the default implementation:
readDataCorrections — safe unaligned reads via std::memcpy
int4BitDotProduct — scalar weighted popcount dot product (with the alignment fix)
default4bitDotProductBatch<BATCH_SIZE> — batched version, pure C++, no SIMD intrinsics
DefaultBBQSimilarityFunction — full scoring struct with batch-8/batch-4/scalar-tail pattern
Wired up BBQ_IP and BBQ_L2 in selectSimilarityFunction
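The scalar and batched pieces listed above can be sketched as below. This is an illustration under stated assumptions, not the code from the diff: it assumes the query is stored as four bit planes laid back to back (plane i holds bit i of each 4-bit query value, each plane `binaryCodeBytes` long) and the data vector is one bit per dimension. All function names ending in `Sketch`, plus `popcount64` and `loadU64`, are hypothetical.

```cpp
#include <cstddef>
#include <cstdint>
#include <cstring>

// Portable popcount (no intrinsics): clears the lowest set bit per iteration.
inline int popcount64(std::uint64_t x) {
    int n = 0;
    while (x) { x &= x - 1; ++n; }
    return n;
}

// Unaligned-safe 8-byte read via memcpy (the alignment fix).
inline std::uint64_t loadU64(const std::uint8_t* p) {
    std::uint64_t v;
    std::memcpy(&v, p, sizeof(v));
    return v;
}

// Weighted popcount dot product: sum over planes of popcount(q_plane & d) << plane.
inline std::int64_t int4BitDotProductSketch(const std::uint8_t* query,
                                            const std::uint8_t* data,
                                            std::size_t binaryCodeBytes) {
    std::int64_t result = 0;
    for (int plane = 0; plane < 4; ++plane) {
        // When binaryCodeBytes % 8 != 0, this pointer is not 8-byte aligned.
        const std::uint8_t* q = query + plane * binaryCodeBytes;
        std::int64_t sub = 0;
        std::size_t i = 0;
        for (; i + 8 <= binaryCodeBytes; i += 8) {
            sub += popcount64(loadU64(q + i) & loadU64(data + i));
        }
        for (; i < binaryCodeBytes; ++i) {  // byte-wise tail
            sub += popcount64(static_cast<std::uint64_t>(q[i] & data[i]));
        }
        result += sub << plane;  // plane i carries weight 2^i
    }
    return result;
}

// Batched version: BATCH_SIZE is a compile-time constant so the loop can unroll.
template <std::size_t BATCH_SIZE>
inline void dotProductBatchSketch(const std::uint8_t* query,
                                  const std::uint8_t* const* data,
                                  std::size_t binaryCodeBytes,
                                  std::int64_t* out) {
    for (std::size_t b = 0; b < BATCH_SIZE; ++b) {
        out[b] = int4BitDotProductSketch(query, data[b], binaryCodeBytes);
    }
}

// Batch-8 / batch-4 / scalar-tail driver over `count` data vectors.
inline void scoreAllSketch(const std::uint8_t* query,
                           const std::uint8_t* const* data,
                           std::size_t count,
                           std::size_t binaryCodeBytes,
                           std::int64_t* out) {
    std::size_t i = 0;
    for (; i + 8 <= count; i += 8) {
        dotProductBatchSketch<8>(query, data + i, binaryCodeBytes, out + i);
    }
    for (; i + 4 <= count; i += 4) {
        dotProductBatchSketch<4>(query, data + i, binaryCodeBytes, out + i);
    }
    for (; i < count; ++i) {
        out[i] = int4BitDotProductSketch(query, data[i], binaryCodeBytes);
    }
}
```

The sketch covers only the raw dot product. The corrections read by readDataCorrections and the final BBQ_IP / BBQ_L2 score computation are not shown.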
Related Issues
Resolves #[Issue number to be closed when this PR is merged]
Check List
--signoff.
By submitting this pull request, I confirm that my contribution is made under the terms of the Apache 2.0 license.
For more information on following Developer Certificate of Origin and signing off your commits, please check here.