chore: add another implementation of multiversion #446

usamoi · 2024-03-25T08:21:04Z

I noticed that it is hard to test the multiple versions of a function, and that the code is easy to get wrong.

Also, dispatching between versions manually is not efficient.

So I reimplemented multiversion.

Examples

// It generates x86_64/v4, x86_64/v3, x86_64/v2, aarch64/neon and fallback versions of this function
#[detect::multiversion(v4, v3, v2, neon, fallback)]
fn f() {}

#[cfg(any(target_arch = "x86_64", doc))]
#[doc(cfg(target_arch = "x86_64"))]
#[detect::target_cpu(enable = "v4")]
unsafe fn g_v4() {}

// It generates x86_64/v3, x86_64/v2, aarch64/neon and fallback versions of this function
// It uses `g_v4` as the x86_64/v4 version of this function
// It exposes the fallback version under the name `g_fallback`
#[detect::multiversion(v4 = import, v3, v2, neon, fallback = export)]
fn g() {}

#[cfg(test)]
fn g_test() {
    #[cfg(target_arch = "x86_64")]
    if detect::v4::detect() {
        assert_eq!(unsafe { g_v4() }, unsafe { g_fallback() });
    } else {
        println!("skipped: v4");
    }
}
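
For readers unfamiliar with the technique, here is a minimal hand-written sketch of the runtime dispatch that a multiversion macro generates. It is not the `detect` crate's implementation: the names `sum`, `sum_avx2` and `sum_fallback` are hypothetical, and it relies only on the standard library's `is_x86_feature_detected!` macro and the `#[target_feature]` attribute.

```rust
/// Portable version, always available on every target.
pub fn sum_fallback(xs: &[f32]) -> f32 {
    xs.iter().sum()
}

/// Same body, compiled with AVX2 enabled so the compiler may emit AVX2 instructions.
/// Hypothetical name; calling it on a CPU without AVX2 is undefined behavior.
#[cfg(target_arch = "x86_64")]
#[target_feature(enable = "avx2")]
pub unsafe fn sum_avx2(xs: &[f32]) -> f32 {
    xs.iter().sum()
}

/// Dispatcher: picks the best version supported by the running CPU.
pub fn sum(xs: &[f32]) -> f32 {
    #[cfg(target_arch = "x86_64")]
    {
        if std::is_x86_feature_detected!("avx2") {
            // SAFETY: AVX2 support was checked at runtime just above.
            return unsafe { sum_avx2(xs) };
        }
    }
    sum_fallback(xs)
}
```

Writing this by hand for every function is where mistakes creep in (a wrong feature check, a missing `cfg`, an untested version), which is the kind of boilerplate an attribute macro can generate consistently.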

Signed-off-by: usamoi <[email protected]>

cutecutecat · 2024-03-25T10:51:02Z

I have a question: why is it hard to test multiversions of a function? Could you explain in more detail?

Could we use the same name for different versions, like:

#[multiversion(targets("x86_64+avx"))]
#[cfg(test)]
fn g() {
...
}

#[multiversion(targets("aarch64+neon"))]
#[cfg(test)]
fn g() {
...
}

#[cfg(test)]
#[multiversion(targets("aarch64+neon", "x86_64+avx"))]
fn g_test() {
   g();
}

Or a flattened test:

#[multiversion(targets("x86_64+avx"))]
#[cfg(test)]
fn xxx_x86_64_avx_test() {
    g();
}

#[multiversion(targets("aarch64+neon"))]
#[cfg(test)]
fn xxx_aarch64_neon_test() {
    g_aarch64_neon();
}

usamoi · 2024-03-25T11:16:00Z

> I have a question: why is it hard to test multiversions of a function? Could you explain in more detail?

It is easy to get wrong. Some cases:

detect::init() is not called in the tests of svector.
test() is called instead of detect() in the implementation of veci8.
Versions of a function are written all over the place and are hard to track.

> Could we use the same name for different versions

I don't understand what you mean. detect::multiversion is for implementations, not for tests. It just enforces a naming convention at module level and lets you write less code. That gives us a standard way to write tests and to view the versions with cargo doc.


usamoi · 2024-03-26T08:37:56Z

@silver-ymz

base::vector::svecf32::sl2_v4_test (formerly base::vector::svecf32::test_sl2_svector) failed (https://github.com/tensorchord/pgvecto.rs/actions/runs/8434503680/job/23098303512?pr=446#step:9:608) in the emulator (you can set up a local environment by following https://github.com/usamoi/pgvecto.rs/blob/multiversion/.github/workflows/rust.yml#L167).

Can you help me with it?

Edit:

You have permission to push commits to usamoi:multiversion.
This test fails intermittently.


cutecutecat

Needs comments for multiversion; otherwise LGTM.

VoVAllen · 2024-03-26T09:24:54Z

I don't think sparse vectors should have a precision problem. 1e-5 should be enough; the failure might mean something is wrong.


silver-ymz · 2024-03-26T13:14:47Z

silver-ymz@5f0bac3 can fix base::vector::svecf32::sl2_v4_test. As for precision, I ran it 100 times and the max difference was 2e-4.

silver-ymz · 2024-03-26T13:16:01Z

Also, in my tests, vector::vecf16::dot_v4_avx512fp16_test failed once.

---- vector::vecf16::dot_v4_avx512fp16_test stdout ----
thread 'vector::vecf16::dot_v4_avx512fp16_test' panicked at crates/base/src/vector/vecf16.rs:229:5:
specialized = 1024, fallback = 1025.4672.

usamoi · 2024-03-26T13:30:51Z

> silver-ymz@5f0bac3 can fix base::vector::svecf32::sl2_v4_test. As for precision, I ran it 100 times and the max difference was 2e-4.

Running the test 100 times on the emulator shows that EPS would have to be larger than 7.0138855.

Are you testing on a real machine? Is the difference caused by emulation?

usamoi · 2024-03-26T13:41:23Z

I found that if I just copy the code from dot_v4 and evaluate $\sum x^2 + \sum y^2 - 2D$, the smallest EPS that passes all 10000 runs is 0.00032043457.

> Also, in my tests, vector::vecf16::dot_v4_avx512fp16_test failed once.

All tests should be stress-tested to find the real EPS...
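
To illustrate the point about measuring EPS empirically, the sketch below compares the direct squared L2 distance with the expanded form $\sum x^2 + \sum y^2 - 2D$ (where $D$ is the dot product) over many random inputs and reports the largest absolute difference. This is not the pgvecto.rs test code: the function names, the dense `f32` inputs, and the small xorshift generator are all assumptions made to keep the example free of external crates.

```rust
/// Squared L2 distance computed directly: sum of (x_i - y_i)^2.
fn sl2_direct(x: &[f32], y: &[f32]) -> f32 {
    x.iter().zip(y).map(|(a, b)| (a - b) * (a - b)).sum()
}

/// Squared L2 distance via the expanded form: sum x^2 + sum y^2 - 2 * dot(x, y).
fn sl2_expanded(x: &[f32], y: &[f32]) -> f32 {
    let xx: f32 = x.iter().map(|a| a * a).sum();
    let yy: f32 = y.iter().map(|b| b * b).sum();
    let d: f32 = x.iter().zip(y).map(|(a, b)| a * b).sum();
    xx + yy - 2.0 * d
}

/// Tiny xorshift32 generator (hypothetical helper, not a library API).
struct XorShift32(u32);

impl XorShift32 {
    /// Returns a pseudo-random value in [0, 1).
    fn next_f32(&mut self) -> f32 {
        self.0 ^= self.0 << 13;
        self.0 ^= self.0 >> 17;
        self.0 ^= self.0 << 5;
        // Keep the top 24 bits so the conversion to f32 is exact.
        (self.0 >> 8) as f32 / (1u32 << 24) as f32
    }
}

/// Runs `trials` random comparisons and returns the largest absolute
/// difference observed between the two formulations.
fn observed_eps(trials: usize, dim: usize, seed: u32) -> f32 {
    let mut rng = XorShift32(seed.max(1)); // xorshift state must be nonzero
    let mut worst = 0.0f32;
    for _ in 0..trials {
        let x: Vec<f32> = (0..dim).map(|_| rng.next_f32()).collect();
        let y: Vec<f32> = (0..dim).map(|_| rng.next_f32()).collect();
        let diff = (sl2_direct(&x, &y) - sl2_expanded(&x, &y)).abs();
        worst = worst.max(diff);
    }
    worst
}
```

The expanded form subtracts two large, nearly equal quantities when the vectors are close, so its rounding error grows with the magnitude of the inputs rather than with the distance itself. A tolerance chosen from a handful of runs can therefore be too tight, which is why a large trial count is useful when picking EPS.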


silver-ymz · 2024-03-26T13:48:19Z

> Are you testing on a real machine?

Yes, I tested on a real machine.

> Is the difference caused by emulation?

I'm not sure. Could you provide the vectors that caused the large EPS so I can reproduce it?


usamoi · 2024-03-26T13:59:36Z

I think I know why now. I should have copied the code in the left zone into my editor.


* chore: add another implementation of multiversion
  Signed-off-by: usamoi <[email protected]>
* chore: update rust-toolchain
  Signed-off-by: usamoi <[email protected]>
* chore: use detect::multiversion
  Signed-off-by: usamoi <[email protected]>
* test: use detect::multiversion
  Signed-off-by: usamoi <[email protected]>
* ci: add sde test
  Signed-off-by: usamoi <[email protected]>
* test: add dot_internal_v4_avx512vnni_test
  Signed-off-by: usamoi <[email protected]>
* test: bvector tests
  Signed-off-by: usamoi <[email protected]>
* chore: add comments for detect
  Signed-off-by: usamoi <[email protected]>
* ci: do not run rust test 3 times
  Signed-off-by: usamoi <[email protected]>
* fix: svector sl2_v4
  Signed-off-by: usamoi <[email protected]>
* test: run svector tests 10000 times
  Signed-off-by: usamoi <[email protected]>
* fix: svecf32_sl2_v4
  Signed-off-by: Mingzhuo Yin <[email protected]>
* test: run vecf16 and veci8 test for 10000 times
  Signed-off-by: usamoi <[email protected]>
* test: run bvecf32 test for 10000 times
  Signed-off-by: usamoi <[email protected]>
* test: run tests for 300 times to reduce ci time
  Signed-off-by: usamoi <[email protected]>
* chore: update rust toolchain
  Signed-off-by: usamoi <[email protected]>

---------

Signed-off-by: usamoi <[email protected]>
Signed-off-by: Mingzhuo Yin <[email protected]>
Co-authored-by: Mingzhuo Yin <[email protected]>
Signed-off-by: jinweios <[email protected]>

chore: add another implementation of multiversion

ed1d248

Signed-off-by: usamoi <[email protected]>

usamoi force-pushed the multiversion branch from ff794c7 to ed1d248 Compare March 25, 2024 08:22

usamoi marked this pull request as draft March 25, 2024 08:33

usamoi added 2 commits March 25, 2024 16:45

chore: update rust-toolchain

19009d0

Signed-off-by: usamoi <[email protected]>

chore: use detect::multiversion

c930af8

Signed-off-by: usamoi <[email protected]>

usamoi marked this pull request as ready for review March 25, 2024 09:51

test: use detect::multiversion

539ab2b

Signed-off-by: usamoi <[email protected]>

usamoi force-pushed the multiversion branch from 5c38d5c to 539ab2b Compare March 25, 2024 09:58

usamoi requested a review from cutecutecat March 25, 2024 10:01

usamoi marked this pull request as draft March 25, 2024 10:18

usamoi marked this pull request as ready for review March 25, 2024 11:07

usamoi closed this Mar 25, 2024

usamoi reopened this Mar 25, 2024

usamoi force-pushed the multiversion branch from c1d4ef0 to eeb8863 Compare March 26, 2024 08:29

ci: add sde test

f096c96

Signed-off-by: usamoi <[email protected]>

usamoi force-pushed the multiversion branch from eeb8863 to f096c96 Compare March 26, 2024 08:30

test: add dot_internal_v4_avx512vnni_test

c6001e7

Signed-off-by: usamoi <[email protected]>

cutecutecat previously approved these changes Mar 26, 2024


test: bvector tests

470915b

Signed-off-by: usamoi <[email protected]>

usamoi dismissed cutecutecat’s stale review via 470915b March 26, 2024 09:31

usamoi force-pushed the multiversion branch 2 times, most recently from a9d7444 to 2da3ede Compare March 26, 2024 10:11

chore: add comments for detect

f4998e4

Signed-off-by: usamoi <[email protected]>

usamoi force-pushed the multiversion branch 3 times, most recently from 9eba749 to 03ddcd7 Compare March 26, 2024 10:30

ci: do not run rust test 3 times

67572c7

Signed-off-by: usamoi <[email protected]>

usamoi force-pushed the multiversion branch from 03ddcd7 to 67572c7 Compare March 26, 2024 10:31

fix: svector sl2_v4

f66651c

Signed-off-by: usamoi <[email protected]>

test: run svector tests 10000 times

5a2bdf5

Signed-off-by: usamoi <[email protected]>

silver-ymz and others added 4 commits March 26, 2024 22:03

fix: svecf32_sl2_v4

0e6552f

Signed-off-by: Mingzhuo Yin <[email protected]>

test: run vecf16 and veci8 test for 10000 times

bc66423

Signed-off-by: usamoi <[email protected]>

test: run bvecf32 test for 10000 times

4a360a3

Signed-off-by: usamoi <[email protected]>

test: run tests for 300 times to reduce ci time

1fc097e

Signed-off-by: usamoi <[email protected]>

usamoi force-pushed the multiversion branch 2 times, most recently from b83a179 to 9cbd82b Compare March 27, 2024 02:45

chore: update rust toolchain

2224d95

Signed-off-by: usamoi <[email protected]>

usamoi force-pushed the multiversion branch from 9cbd82b to 2224d95 Compare March 27, 2024 02:50

usamoi added this pull request to the merge queue Mar 27, 2024

Merged via the queue into tensorchord:main with commit 9766d43 Mar 27, 2024
13 checks passed
