chore: add another implementation of multiversion #446

Merged
merged 16 commits into tensorchord:main from usamoi:multiversion
Mar 27, 2024

Conversation

usamoi
Collaborator

@usamoi usamoi commented Mar 25, 2024

I noticed that it is hard to test the multiple versions of a function, and that the code is easy to get wrong.

Also, dispatching between versions manually is not efficient.

So I reimplemented multiversion.

Examples

// Generates x86_64/v4, x86_64/v3, x86_64/v2, aarch64/neon and fallback versions of this function
#[detect::multiversion(v4, v3, v2, neon, fallback)]
fn f() {}

#[cfg(any(target_arch = "x86_64", doc))]
#[doc(cfg(target_arch = "x86_64"))]
#[detect::target_cpu(enable = "v4")]
unsafe fn g_v4() {}

// Generates x86_64/v3, x86_64/v2, aarch64/neon and fallback versions of this function
// Uses `g_v4` as the x86_64/v4 version of this function
// Exposes the fallback version under the name `g_fallback`
#[detect::multiversion(v4 = import, v3, v2, neon, fallback = export)]
fn g() {}

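// The `cfg` sits on the function because attributes are not allowed on `if` expressions,
// and `#[test]` is needed for the test harness to run it.
#[cfg(all(target_arch = "x86_64", test))]
#[test]
fn g_test() {
    if detect::v4::detect() {
        assert_eq!(unsafe { g_v4() }, unsafe { g_fallback() });
    } else {
        println!("skipped: v4");
    }
}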
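
For readers unfamiliar with the pattern, the idea of dispatching between versions at run time can be sketched with the standard library alone. This is only an illustration under stated assumptions: `sum`, `sum_avx2` and `sum_fallback` are hypothetical names, the feature check uses std's `is_x86_feature_detected!` rather than the `detect` crate from this PR, and the code is not the expansion that `detect::multiversion` produces.

```rust
// Hypothetical illustration only; not the output of `detect::multiversion`.

/// Portable version, compiled for the baseline target.
pub fn sum_fallback(xs: &[f32]) -> f32 {
    xs.iter().sum()
}

/// Same body, but the compiler is allowed to use AVX2 instructions inside it.
#[cfg(target_arch = "x86_64")]
#[target_feature(enable = "avx2")]
pub unsafe fn sum_avx2(xs: &[f32]) -> f32 {
    xs.iter().sum()
}

/// Dispatcher: picks a version at run time based on the CPU's features.
pub fn sum(xs: &[f32]) -> f32 {
    #[cfg(target_arch = "x86_64")]
    {
        if std::arch::is_x86_feature_detected!("avx2") {
            // SAFETY: AVX2 support was confirmed on this CPU just above.
            return unsafe { sum_avx2(xs) };
        }
    }
    sum_fallback(xs)
}

fn main() {
    let xs = [1.0_f32, 2.0, 3.0, 4.0];
    println!("{}", sum(&xs));
}
```

Writing the dispatcher by hand for every function is where mistakes tend to creep in, which is the motivation stated above for generating it with an attribute.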

@usamoi usamoi marked this pull request as draft March 25, 2024 08:33
@usamoi usamoi marked this pull request as ready for review March 25, 2024 09:51
@usamoi usamoi requested a review from cutecutecat March 25, 2024 10:01
@usamoi usamoi marked this pull request as draft March 25, 2024 10:18
@cutecutecat
Member

I have a question: why is it hard to test multiple versions of a function? Could you explain in more detail?

Could we use the same name for the different versions, like:

#[multiversion(targets("x86_64+avx"))]
#[cfg(test)]
fn g() {
...
}

#[multiversion(targets("aarch64+neon"))]
#[cfg(test)]
fn g() {
...
}

#[cfg(test)]
#[multiversion(targets("aarch64+neon", "x86_64+avx"))]
fn g_test() {
    g();
}

Or a flattened test:

#[multiversion(targets("x86_64+avx"))]
#[cfg(test)]
fn xxx_x86_64_avx_test() {
    g();
}

#[multiversion(targets("aarch64+neon"))]
#[cfg(test)]
fn xxx_aarch64_neon_test() {
    g_aarch64_neon();
}

@usamoi usamoi marked this pull request as ready for review March 25, 2024 11:07
@usamoi
Collaborator Author

usamoi commented Mar 25, 2024

I have a question: why is it hard to test multiple versions of a function? Could you explain in more detail?

It is easy to get wrong. Some cases:

  • detect::init() is not called in the tests of svector.
  • test() is called instead of detect() in the implementation of veci8.
  • Versions of a function are scattered across the code and hard to track.
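
The cases above suggest a single test shape: check the CPU feature at run time, report a skip when it is missing, and otherwise compare the specialized version against the fallback. A minimal sketch of that shape follows. All names (`dot_fallback`, `dot_v3`, `check_dot_v3`) are hypothetical, and the check uses std's `is_x86_feature_detected!` instead of this PR's `detect` crate.

```rust
// Hypothetical names throughout; std's detection macro stands in for `detect`.

/// Portable reference version.
pub fn dot_fallback(a: &[f32], b: &[f32]) -> f32 {
    assert_eq!(a.len(), b.len());
    a.iter().zip(b).map(|(x, y)| x * y).sum()
}

/// Version compiled with AVX2 and FMA enabled.
#[cfg(target_arch = "x86_64")]
#[target_feature(enable = "avx2,fma")]
pub unsafe fn dot_v3(a: &[f32], b: &[f32]) -> f32 {
    assert_eq!(a.len(), b.len());
    a.iter().zip(b).map(|(x, y)| x * y).sum()
}

/// Returns `None` when the CPU lacks the features, so the caller can report a skip.
/// Returns `Some(true)` when both versions agree within `eps`.
pub fn check_dot_v3(a: &[f32], b: &[f32], eps: f32) -> Option<bool> {
    #[cfg(target_arch = "x86_64")]
    {
        if std::arch::is_x86_feature_detected!("avx2")
            && std::arch::is_x86_feature_detected!("fma")
        {
            // SAFETY: both required features were confirmed just above.
            let specialized = unsafe { dot_v3(a, b) };
            let fallback = dot_fallback(a, b);
            return Some((specialized - fallback).abs() <= eps);
        }
    }
    // Reached on other architectures or on CPUs without the features.
    let _ = (a, b, eps);
    None
}

fn main() {
    match check_dot_v3(&[1.0, 2.0, 3.0], &[4.0, 5.0, 6.0], 1e-5) {
        Some(ok) => println!("v3 matches fallback: {ok}"),
        None => println!("skipped: v3"),
    }
}
```

Keeping the detection, the skip message and the comparison in one place makes it harder to forget a step in an individual test.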

Could we use the same name for the different versions

I do not understand what you mean. detect::multiversion is for implementation, not for tests. It just enforces a naming convention and module-level placement, and lets you write less code. That gives us a standard way to write tests and to view them with cargo doc.

@usamoi usamoi closed this Mar 25, 2024
@usamoi usamoi reopened this Mar 25, 2024
Signed-off-by: usamoi <[email protected]>
@usamoi
Collaborator Author

usamoi commented Mar 26, 2024

@silver-ymz

base::vector::svecf32::sl2_v4_test (formerly base::vector::svecf32::test_sl2_svector) failed (https://github.com/tensorchord/pgvecto.rs/actions/runs/8434503680/job/23098303512?pr=446#step:9:608) in the emulator (you can set up a local environment by following https://github.com/usamoi/pgvecto.rs/blob/multiversion/.github/workflows/rust.yml#L167).

Can you help me with it?

Edit:

  1. You have permission to push commits to usamoi:multiversion.
  2. This test fails intermittently.

cutecutecat
cutecutecat previously approved these changes Mar 26, 2024
Member

@cutecutecat cutecutecat left a comment


Needs comments for multiversion; otherwise LGTM.

@VoVAllen
Member

VoVAllen commented Mar 26, 2024

I think sparse vectors should not have a precision problem. 1e-5 should be enough; the failure might mean something is wrong.

Signed-off-by: usamoi <[email protected]>
@usamoi usamoi force-pushed the multiversion branch 2 times, most recently from a9d7444 to 2da3ede Compare March 26, 2024 10:11
@usamoi usamoi force-pushed the multiversion branch 3 times, most recently from 9eba749 to 03ddcd7 Compare March 26, 2024 10:30
@silver-ymz
Member

silver-ymz commented Mar 26, 2024

silver-ymz@5f0bac3 fixes base::vector::svecf32::sl2_v4_test. As for precision, I ran it 100 times and the max difference was 2e-4.
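
As background for the precision discussion, a scalar reference for the squared L2 distance between two sparse vectors can be sketched as a merge over sorted indices. This is a hedged illustration, not the code from the commit above: the function name `sparse_sl2` and the representation (parallel slices of strictly increasing `u32` indices and `f32` values) are assumptions made for the example.

```rust
use std::cmp::Ordering;

/// Squared L2 distance between two sparse vectors.
/// Assumed layout (hypothetical): indices are strictly increasing and
/// `values[k]` is the entry at position `indices[k]`; missing entries are zero.
pub fn sparse_sl2(ai: &[u32], av: &[f32], bi: &[u32], bv: &[f32]) -> f32 {
    assert_eq!(ai.len(), av.len());
    assert_eq!(bi.len(), bv.len());
    let (mut i, mut j, mut acc) = (0, 0, 0.0_f32);
    while i < ai.len() && j < bi.len() {
        match ai[i].cmp(&bi[j]) {
            // Entry only in `a`: the other side is an implicit zero.
            Ordering::Less => {
                acc += av[i] * av[i];
                i += 1;
            }
            // Entry only in `b`.
            Ordering::Greater => {
                acc += bv[j] * bv[j];
                j += 1;
            }
            // Entry in both: accumulate the squared difference.
            Ordering::Equal => {
                let d = av[i] - bv[j];
                acc += d * d;
                i += 1;
                j += 1;
            }
        }
    }
    // Leftover entries on either side face implicit zeros.
    acc += av[i..].iter().map(|v| v * v).sum::<f32>();
    acc += bv[j..].iter().map(|v| v * v).sum::<f32>();
    acc
}

fn main() {
    let d = sparse_sl2(&[0, 2, 5], &[1.0, 2.0, 3.0], &[2, 3, 5], &[4.0, 1.0, 1.0]);
    println!("{d}");
}
```

Because this form squares each difference directly, it avoids subtracting large, nearly equal sums, which is one reason a sparse kernel would be expected to stay close to its reference.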

@silver-ymz
Member

Also, in my tests, vector::vecf16::dot_v4_avx512fp16_test failed once.

---- vector::vecf16::dot_v4_avx512fp16_test stdout ----
thread 'vector::vecf16::dot_v4_avx512fp16_test' panicked at crates/base/src/vector/vecf16.rs:229:5:
specialized = 1024, fallback = 1025.4672.

@usamoi
Collaborator Author

usamoi commented Mar 26, 2024

silver-ymz@5f0bac3 fixes base::vector::svecf32::sl2_v4_test. As for precision, I ran it 100 times and the max difference was 2e-4.

Running it 100 times on the emulator shows EPS should be larger than 7.0138855.

Are you testing on a real machine? Is the difference caused by emulation?

@usamoi
Collaborator Author

usamoi commented Mar 26, 2024

I found that if I just copy the code in dot_v4 and evaluate $\sum x^2 + \sum y^2 - 2D$, the smallest EPS that passes all 10000 runs is 0.00032043457.
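
The expression above is the expanded form of the squared L2 distance, $\sum (x_i - y_i)^2 = \sum x_i^2 + \sum y_i^2 - 2D$ with $D = \sum x_i y_i$. The two forms are equal in exact arithmetic but can round differently in `f32`, because the expanded form subtracts large, nearly equal quantities. A small sketch with hypothetical function names, not the project's code:

```rust
/// Direct form: sum of squared differences.
pub fn sl2_direct(x: &[f32], y: &[f32]) -> f32 {
    x.iter().zip(y).map(|(a, b)| (a - b) * (a - b)).sum()
}

/// Expanded form: sum(x^2) + sum(y^2) - 2 * dot(x, y).
pub fn sl2_expanded(x: &[f32], y: &[f32]) -> f32 {
    let xx: f32 = x.iter().map(|a| a * a).sum();
    let yy: f32 = y.iter().map(|b| b * b).sum();
    let d: f32 = x.iter().zip(y).map(|(a, b)| a * b).sum();
    xx + yy - 2.0 * d
}

fn main() {
    // Small integers: every intermediate value is exactly representable,
    // so the two forms agree exactly.
    let (x, y) = ([1.0_f32, 2.0, 3.0], [4.0_f32, 6.0, 8.0]);
    println!("{} {}", sl2_direct(&x, &y), sl2_expanded(&x, &y));

    // Large, nearly equal vectors: the expanded form loses digits to cancellation,
    // so the two printed values will generally differ.
    let x = vec![1000.1_f32; 1024];
    let y = vec![1000.2_f32; 1024];
    println!("{} {}", sl2_direct(&x, &y), sl2_expanded(&x, &y));
}
```

This is why a tolerance derived from one formulation may not carry over to a test that compares against the other.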

Also, in my tests, vector::vecf16::dot_v4_avx512fp16_test failed once.

All tests should be stress-tested to find the real EPS...
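
One way to estimate a realistic EPS is to run many random trials and record the largest difference seen between two versions. The sketch below is an assumption-laden illustration: the names are hypothetical, the "specialized" version is simulated by an eight-lane accumulation rather than real SIMD code, and a tiny xorshift generator replaces any external random number crate.

```rust
/// Tiny xorshift64 generator so the sketch needs no external crates.
/// The seed must be nonzero.
struct XorShift64(u64);

impl XorShift64 {
    fn next_f32(&mut self) -> f32 {
        self.0 ^= self.0 << 13;
        self.0 ^= self.0 >> 7;
        self.0 ^= self.0 << 17;
        // Use the top 24 bits to get a value in [0, 1).
        (self.0 >> 40) as f32 / (1u64 << 24) as f32
    }
}

/// Sequential accumulation, standing in for a fallback version.
fn dot_sequential(a: &[f32], b: &[f32]) -> f32 {
    a.iter().zip(b).map(|(x, y)| x * y).sum()
}

/// Eight partial sums combined at the end, standing in for a SIMD-style version.
fn dot_lanes(a: &[f32], b: &[f32]) -> f32 {
    let mut lanes = [0.0_f32; 8];
    for (i, (x, y)) in a.iter().zip(b).enumerate() {
        lanes[i % 8] += x * y;
    }
    lanes.iter().sum()
}

/// Largest absolute difference seen over `trials` random inputs of length `dims`.
pub fn max_abs_diff(trials: usize, dims: usize, seed: u64) -> f32 {
    let mut rng = XorShift64(seed);
    let mut worst = 0.0_f32;
    for _ in 0..trials {
        let a: Vec<f32> = (0..dims).map(|_| rng.next_f32()).collect();
        let b: Vec<f32> = (0..dims).map(|_| rng.next_f32()).collect();
        worst = worst.max((dot_sequential(&a, &b) - dot_lanes(&a, &b)).abs());
    }
    worst
}

fn main() {
    println!("max |diff| = {}", max_abs_diff(300, 1024, 0x9E37_79B9_7F4A_7C15));
}
```

The observed maximum depends on the input distribution, the vector length and the number of trials, so a tolerance chosen this way is an empirical estimate rather than a guaranteed bound.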

Signed-off-by: usamoi <[email protected]>
@silver-ymz
Member

Are you testing on a real machine?

Yes, I test on a real machine.

Is the difference caused by emulation?

I'm not sure. Could you provide the vectors that caused the large EPS so I can reproduce it?

@usamoi
Collaborator Author

usamoi commented Mar 26, 2024

I think I know why now: I should have copied the code from the left pane into my editor.

@usamoi usamoi force-pushed the multiversion branch 2 times, most recently from b83a179 to 9cbd82b Compare March 27, 2024 02:45
@usamoi usamoi added this pull request to the merge queue Mar 27, 2024
Merged via the queue into tensorchord:main with commit 9766d43 Mar 27, 2024
13 checks passed
JinweiOS pushed a commit to JinweiOS/pgvecto.rs that referenced this pull request May 21, 2024
* chore: add another implementation of multiversion

Signed-off-by: usamoi <[email protected]>

* chore: update rust-toolchain

Signed-off-by: usamoi <[email protected]>

* chore: use detect::multiversion

Signed-off-by: usamoi <[email protected]>

* test: use detect::multiversion

Signed-off-by: usamoi <[email protected]>

* ci: add sde test

Signed-off-by: usamoi <[email protected]>

* test: add dot_internal_v4_avx512vnni_test

Signed-off-by: usamoi <[email protected]>

* test: bvector tests

Signed-off-by: usamoi <[email protected]>

* chore: add comments for detect

Signed-off-by: usamoi <[email protected]>

* ci: do not run rust test 3 times

Signed-off-by: usamoi <[email protected]>

* fix: svector sl2_v4

Signed-off-by: usamoi <[email protected]>

* test: run svector tests 10000 times

Signed-off-by: usamoi <[email protected]>

* fix: svecf32_sl2_v4

Signed-off-by: Mingzhuo Yin <[email protected]>

* test: run vecf16 and veci8 test for 10000 times

Signed-off-by: usamoi <[email protected]>

* test: run bvecf32 test for 10000 times

Signed-off-by: usamoi <[email protected]>

* test: run tests for 300 times to reduce ci time

Signed-off-by: usamoi <[email protected]>

* chore: update rust toolchain

Signed-off-by: usamoi <[email protected]>

---------

Signed-off-by: usamoi <[email protected]>
Signed-off-by: Mingzhuo Yin <[email protected]>
Co-authored-by: Mingzhuo Yin <[email protected]>
Signed-off-by: jinweios <[email protected]>