Huh. If you need a few more weird results:

ggml_vulkan: 0 = AMD Radeon RX 7600 XT (RADV NAVI33) (radv) | uma: 0 | fp16: 1 | warp size: 64 | shared memory: 65536 | int dot: 1 | matrix cores: KHR_coopmat

768x960 image, SDXL:
(rev e767be7) So... it looks like I don't get that same drop in tiles/centisecond. On the other hand, my card seems to dislike odd-sized tiles. And those iteration numbers drop very abruptly at some points... possibly because the number of tiles needed changes on both axes at the same time? Artifacts are very noticeable at tile size 14 and below, BTW. But above 17 or so, I can hardly notice anything.
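To make the "tile count changes on both axes at once" guess easier to check, here is a rough sketch that counts tiles per axis for each tile size. It is not taken from the project's code: the helper names, the 50% overlap, and the assumption that tile size is measured in latent pixels are all mine. The only part I'm sure of is the 8x downscale of the SDXL VAE, which makes a 768x960 image a 96x120 latent.

```python
import math


def tiles_per_axis(latent_len, tile, overlap_frac=0.5):
    """Tiles needed to cover one axis with overlapping tiles.

    ASSUMPTION: neighbouring tiles overlap by overlap_frac of the tile size.
    The real backend may use a different overlap or rounding.
    """
    if tile >= latent_len:
        return 1
    overlap = int(tile * overlap_frac)
    stride = tile - overlap  # how far each new tile advances
    return math.ceil((latent_len - overlap) / stride)


def tile_grid(width_px, height_px, tile, vae_scale=8, overlap_frac=0.5):
    """Return (tiles_x, tiles_y, total) for an image of the given pixel size.

    ASSUMPTION: tile is given in latent pixels (image pixels / vae_scale).
    """
    latent_w = width_px // vae_scale
    latent_h = height_px // vae_scale
    nx = tiles_per_axis(latent_w, tile, overlap_frac)
    ny = tiles_per_axis(latent_h, tile, overlap_frac)
    return nx, ny, nx * ny


if __name__ == "__main__":
    prev = None
    for tile in range(10, 33):
        nx, ny, total = tile_grid(768, 960, tile)
        # Flag tile sizes where the count changed on both axes at once.
        both = prev is not None and nx != prev[0] and ny != prev[1]
        flag = "  <- both axes changed" if both else ""
        print(f"tile={tile:2d}  grid={nx}x{ny}  total={total}{flag}")
        prev = (nx, ny)
```

If the flagged tile sizes line up with the abrupt drops in the measurements, that would support the guess; if they don't, the cause is probably elsewhere (or my overlap assumption is off).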
-
On Vulkan with RX6800 (Windows driver), x axis is the tile size:
Similar results with an RX 5700XT.
According to my tests, the optimal tile size for VAE decoding on the Vulkan backend seems to be 26. But I'm really confused by the huge performance drop that happens for tile sizes between 10 and 25 for no obvious reason. I'm not seeing the same thing on CPU so far (CPU performance seems pretty consistent, with a very slight performance advantage for smaller tile sizes that should not matter in practice).