Skip to content
New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

Move tfjs to first, average, worst scoring. #23

Merged
merged 2 commits into from
Jan 16, 2025
Merged

Conversation

kmiller68
Copy link
Contributor

Since this test is relatively slow per iteration I reduced the iteration counts to: non-simd: 15 iterations with worst 2.
simd: 80 iterations with worst 4 (default).

Also, I moved all the benchmark files to a tfjs directory to help organize the wasm directory a bit.

Since this test is relatively slow per iteration I reduced the iteration counts to:
non-simd: 15 iterations with worst 2.
simd: 80 iterations with worst 4 (default).

Also, I moved all the benchmark files to a tfjs directory to help organize the wasm directory a bit.
@danleh
Copy link
Contributor

danleh commented Jan 15, 2025

Overall this looks good to me, with some high-level comments though:

  1. Unlike the other Emscripten workloads, this doesn't contain a build script. How are the Wasm and JS files generated? Did you compile them at some point, or do they come from NPM packages? Ideally, we should be able to recompile with an up-to-date toolchain, but alternatively can we at least download them from somewhere, in case the upstream packages gets updated? in-depth.html mentions several repositories for Tensorflow.js, the Wasm backend, and models, but I am not sure how this is setup or copied together.
  2. Even with the lowered iteration counts, in particular the SIMD test is pretty slow (in the order of 30 to 40s). Could we reduce the workload per iteration or just lower iterations further, say to 30?
  3. Looking at the profile/flamegraph, a substantial number of samples are spent on parsing the .js files and an even larger amount on base64ToArrayBuffer, which is used when reading the model weights. Other folks have reported JetStream 3 initialization taking much longer than before. Could this be because of the large .js files and in particular model weights of this benchmark? Should we choose maybe just one or two models/inference tasks, not four?
  4. Given that the non-SIMD variant takes roughly twice the runtime and SIMD is widely supported by now. Should we only run the SIMD variant?

@kmiller68
Copy link
Contributor Author

r.e. your comments:

  1. Yeah setting up a build script for this is on my todo list. First I have to figure out how the test was originally built though.
  2. Interesting, for me the simd version is about ~14s (jsc) / ~16s (v8) on my M4 MBP. The non-simd is ~8/~9s respectively. What CPU are you using? That still might be a bit long so I'm happy to trim the simd iteration count into e.g. half?
  3. Let's add that to the doc and talk about it at the next sync.
  4. Yeah, I brought this up at the last sync I'm happy to trim the non-SIMD. Although, I think we should keep it until we have enough other wasm tests. At least given the plan to remove the smaller wasm benchmarks already.

This makes the runtime for the benchmark ~7s/9s/11s for jsc/v8/sm respectively on my M4 Max MBP.
@danleh
Copy link
Contributor

danleh commented Jan 16, 2025

  1. Perfect, I'll take another look with a build script then.
  2. I'm running on a server x86 CPU, very high core count but pretty low frequency/boost, that could explain it. Changing iterations to 40 for SIMD SGTM.
  3. Sure, I added it to the doc/agenda for next time.
  4. Sure, let's keep the non-SIMD for now.

@kmiller68
Copy link
Contributor Author

Ok sounds like we have a plan. Gonna merge for now.

@kmiller68 kmiller68 merged commit 3f50521 into main Jan 16, 2025
kmiller68 added a commit that referenced this pull request Jan 21, 2025
This reverts commit 3f50521, reversing
changes made to c1a122d.
kmiller68 added a commit that referenced this pull request Jan 21, 2025
Revert "Merge pull request #23 from WebKit/tfjs-to-js-scoring"
Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Labels
None yet
Projects
None yet
Development

Successfully merging this pull request may close these issues.

2 participants