Add AMD Ryzen Threadripper 7000 series #1095

Merged
merged 5 commits into from
Jan 15, 2025

melroy89
Contributor

@melroy89 melroy89 commented Jan 11, 2025

  • Adding AMD Ryzen Threadripper 7000 series to the drop-down selection option.
  • Use 10 TFLOPS for this Threadripper generation.
  • Update comment to explain we use FP32 results for CPUs

It seems there might be no FP16 or FP32 measurements yet on either cpu-monkey.com or techpowerup.com for the Threadripper 7000 series CPUs.

There are about 7 or 8 CPUs in this 7000 series, so I would pick an AMD Ryzen Threadripper 7970X or AMD Ryzen Threadripper 7980X as a generic CPU for the whole 7000 series.
Then again, I have no clue what the TFLOPS would be.

I personally own an AMD Ryzen Threadripper 7960X. Could I run a benchmark myself? If so, which tool or calculation (stress) test would you advise to give us a good indication?

@pcuenca
Member

pcuenca commented Jan 13, 2025

Hi @melroy89! In general, we try to use representative numbers for the way the hardware is typically used or benchmarked, which usually means fp16 for GPU and mostly fp32 for CPUs. For CPU families, we talked in this discussion about using numbers between the min and the max.

But it does seem complicated in this case; I could not find much information either. I saw 190,950 MOps/Sec of "floating point math" here, but I don't know if that benchmark is comparable to the specs published in techpowerup and other places. I hope you get luckier and can find some info somewhere :)

@melroy89
Contributor Author

I literally own this threadripper CPU, so if you let me know what benchmark to run, I know at least the min value of fp32.

@pcuenca
Member

pcuenca commented Jan 14, 2025

I literally own this threadripper CPU, so if you let me know what benchmark to run, I know at least the min value of fp32.

Hi @melroy89, we don't run benchmarks; we retrieve numbers reported by trustworthy resources. I believe in most cases they are theoretical figures calculated by taking into account the frequency, number of cores, number of floating ops per clock cycle, etc., but I haven't looked at the exact process yet.

@melroy89
Contributor Author

melroy89 commented Jan 14, 2025

Well, normally you get these numbers by running benchmarks, just like your "trusted resources" use benchmarks to get their numbers.

Here I found an FP32 figure in TFLOPS for the top-tier Threadripper 7000:

The AIDA64 GPGPU benchmark sees the Ryzen Threadripper PRO 7995WX processor spitting out 12.16 TFLOPs of FP32 compute performance

As you can see, they use a benchmark tool called "AIDA64 GPGPU benchmark" under Windows. Despite the name, it also tests CPU TFLOPS.

Read more: https://www.tweaktown.com/news/93928/amds-new-ryzen-threadripper-pro-7995wx-cpu-has-more-fp32-perf-than-xbox-series-and-ps5/index.html

@melroy89
Contributor Author

melroy89 commented Jan 14, 2025

For now, I think the best number I can come up with is around 10 TFLOPS for FP32 on the Threadripper 7000 series. That puts it below the high-end CPU result (max value) of 12.16 TFLOPS.

I also updated the comment to explain we use FP32 for CPUs.

@melroy89 melroy89 marked this pull request as ready for review January 14, 2025 16:12
@pcuenca
Member

pcuenca commented Jan 14, 2025

Hi @melroy89, sorry if I caused confusion. I meant that in many cases these aren't really measured performance numbers, but theoretical ones based on specs (see below). I'm happy to accept the figures you got from your real-life benchmark, and grateful that you took the time to run it!

As for how specs can translate to teraflops, consider a 3090 GPU. From the spec sheet we see that it has a "boost" clock frequency of 1695 MHz and 10496 parallel "shading" units, or CUDA cores. According to this page, the Ampere cards can run 2 FP32 operations per clock cycle using the PTX instruction set, so we get 1.695 GHz * 10496 * 2 = 35,581 GFLOPS, or 35.58 tflops, the same as what appears in the techpowerup sheet. I don't know why the fp16 performance is the same.
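
The GPU arithmetic above can be sketched as a small helper. The function name and its parameters are illustrative assumptions, not part of this PR or of any library; the inputs are the figures quoted in the comment.

```python
def peak_tflops(clock_ghz: float, units: int, ops_per_cycle: float) -> float:
    """Theoretical peak in TFLOPS.

    clock (GHz) x parallel units x operations per cycle per unit gives
    GFLOPS; dividing by 1000 converts to TFLOPS.
    """
    gflops = clock_ghz * units * ops_per_cycle
    return gflops / 1000.0


# RTX 3090 figures quoted in the comment: 1695 MHz boost clock,
# 10496 CUDA cores, 2 FP32 operations per clock cycle.
rtx_3090 = peak_tflops(clock_ghz=1.695, units=10496, ops_per_cycle=2)
print(f"{rtx_3090:.2f} TFLOPS")  # prints "35.58 TFLOPS"
```

This is a spec-sheet upper bound only; it says nothing about sustained clocks or memory bandwidth.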

Consider now a CPU, the Intel® Core™ i9-7980XE Extreme Edition. Its "Max Turbo Frequency" is 4.20 GHz, and it has 18 parallel cores. It has two AVX-512 fused-multiply-add (FMA) units that work on 512-bit vectors. The two units run in parallel, but each one takes two cycles to compute an FMA instruction (see the throughput number for Skylake-X, which is 0.5). AVX-512 is capable of computing one 512-bit operation at a time, or 16 (512/32) "packed" fp32 operations. Putting it together, we have a theoretical performance of 4.20 * 18 * 512 * 2 * 0.5 / 32 = 1209.6 GFLOPS, or about 1.2 tflops, which is close to the 1.3 tflops reported here (at 4.3 GHz instead of 4.2).
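
A minimal sketch of the same CPU calculation follows, using exactly the accounting in the comment above. The function and parameter names are hypothetical. Note the assumptions baked in: each FMA is counted as one operation, and each unit is taken to complete 0.5 FMA per cycle. Other conventions (for example, counting an FMA as two floating-point operations) would give a different figure.

```python
def cpu_peak_tflops(
    clock_ghz: float,
    cores: int,
    vector_bits: int,
    fma_units: int,
    fma_per_cycle_per_unit: float,
    element_bits: int = 32,
) -> float:
    """Theoretical peak in TFLOPS, following the comment's accounting."""
    # Number of packed values per vector register: 512 / 32 = 16 for fp32.
    lanes = vector_bits // element_bits
    gflops = clock_ghz * cores * lanes * fma_units * fma_per_cycle_per_unit
    return gflops / 1000.0


# Intel Core i9-7980XE figures quoted in the comment.
i9_7980xe = cpu_peak_tflops(
    clock_ghz=4.20,
    cores=18,
    vector_bits=512,
    fma_units=2,
    fma_per_cycle_per_unit=0.5,
)
print(f"{i9_7980xe:.4f} TFLOPS")  # about 1.21, i.e. the ~1.2 tflops above
```

Applying the same helper to a Threadripper would require per-core FMA unit counts and throughput figures that the thread does not provide, so no such number is computed here.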

So essentially these numbers come from the manufacturers and not from real-world benchmarks, which use more operations than just FMA and reflect overhead, thermal throttling, etc. But I have no idea how to track down these numbers for the Threadrippers and be sure that the calculations are correct, so I'm happy to accept your measurements!

@melroy89
Contributor Author

Fixed typos

@julien-c julien-c merged commit c999114 into huggingface:main Jan 15, 2025
@melroy89 melroy89 deleted the patch-1 branch January 15, 2025 15:45
aykutkardas pushed a commit to gokayfem/huggingface.js that referenced this pull request Jan 20, 2025
Co-authored-by: Julien Chaumond <[email protected]>