Is Tensile adapted to RDNA2 ? #1579

v01dXYZ · 2022-08-29T16:52:47Z

Hello,
As you may know, RDNA2 has a 128 MB L3 cache, which is an important difference from the GCN/CDNA architectures. It makes it possible to use a memory subsystem with a narrower bus efficiently: 8 Samsung GDDR6 chips (8 x 32 bits x 16 Gbps), which still gives higher throughput than a Vega 10. Are Tensile or MISA adapted to a microarchitecture where caching (i.e. spatial/temporal locality) is central to achieving peak performance?
Do you think RDNA2 could be as good as, or even better than, a GCN/CDNA architecture for GEMM by keeping blocks in the L3 cache for as long as possible? We have 128 MB / 160 wavefronts ~= 800 KB per wavefront (160 wavefronts = 80 CUs * 2 concurrent 32-lane wavefronts per CU). That is not far from the L2 cache found on CPUs (Ryzen 5xxx series: 512 KB L2 cache per core).
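
For reference, the arithmetic behind those two figures can be sketched in a few lines of Python. This is only a back-of-the-envelope check using the numbers quoted in this post, not a measurement. The function names are made up for illustration, and 128 MB is taken as a decimal value (128 * 10^6 bytes); with binary units the share would be about 819 KiB instead.

```python
# Back-of-the-envelope check of the figures quoted in this post.
# Every constant below comes from the post itself; nothing is measured.

GDDR6_CHIPS = 8          # number of memory chips
BITS_PER_CHIP = 32       # interface width of each chip, in bits
GBPS_PER_PIN = 16        # data rate per pin, in Gbit/s

L3_BYTES = 128 * 10**6   # 128 MB, assumed decimal here
COMPUTE_UNITS = 80
WAVES_PER_CU = 2         # concurrent 32-lane wavefronts per CU, as assumed in the post


def memory_bandwidth_gb_per_s(chips, bits_per_chip, gbps_per_pin):
    """Peak DRAM bandwidth in GB/s: total bus width (bits) * per-pin rate / 8."""
    return chips * bits_per_chip * gbps_per_pin / 8


def cache_share_kb(cache_bytes, compute_units, waves_per_cu):
    """Cache capacity per concurrent wavefront, in decimal KB."""
    return cache_bytes / (compute_units * waves_per_cu) / 1000


if __name__ == "__main__":
    bw = memory_bandwidth_gb_per_s(GDDR6_CHIPS, BITS_PER_CHIP, GBPS_PER_PIN)
    share = cache_share_kb(L3_BYTES, COMPUTE_UNITS, WAVES_PER_CU)
    print(f"bus width: {GDDR6_CHIPS * BITS_PER_CHIP} bits, peak bandwidth: {bw} GB/s")
    print(f"L3 share per wavefront: {share} KB")
```

An even split of the cache across wavefronts is of course a simplification: the L3 is shared, so this is an average budget rather than a guaranteed per-wavefront allocation.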

bragadeesh · 2022-11-01T17:11:38Z

Yes, Tensile has support for RDNA2. Assigning this to @TonyYHsieh for further support.

ppanchad-amd · 2024-07-15T20:24:12Z

@v01dXYZ Do you still need assistance with this ticket? If not, please close the ticket. Thanks!

bragadeesh assigned TonyYHsieh Nov 1, 2022

v01dXYZ closed this as completed Aug 16, 2024
