clang-18 built kernel v6.11 leads to desktop graphics corruption + "AMDGPU(0): amdgpu_setup_kernel_mem failed" vs. same kernel built with gcc-14 runs ok (v6.11, x86_64) #2053
Comments
None of the differences I see between the configurations would appear to cause this. There is not much information to go on here. It is entirely possible this is a code problem that just happens to show up with clang due to optimization or code generation differences. Testing with |
No suspicious output with KASAN and/or UBSAN. With KMSAN (no other sanitizers selected), however, the kernel doesn't boot at all. It gets stuck at the UEFI boot screen stating that it will load the kernel image, and then nothing happens for several minutes... |
Thanks for double checking on the sanitizers. Another thing to check is if this happens with an older or newer version of LLVM, which may point to a regression.
cc @ramosian-glider, that sounds unexpected? |
I'll check with llvm 19.1.1 once it's out and report back (also on the KMSAN issue). |
Can you please share the kernel build config? |
@ramosian-glider Here's my v6.10.12 one. The same config without KMSAN boots just fine on my Ryzen 5950X. |
Did a build now with 19.1.1 and kernel v6.11.1 and can confirm the amdgpu graphics corruption is still there. The kernel also still doesn't boot with KMSAN enabled. I just opened #2054 for that one to avoid confusion. |
Found out the issue only shows up when So the current situation is to either revert 675d6d3 or to not use v6.11.2 kernel .config attached. |
Perhaps the value assigned to |
I am surprised that GCC would not show the same problem. Maybe there is a subtle difference in the implementation of pattern initialization between clang and GCC? However, due to pattern initialization being the key trigger, I would say it is more likely that there is a bug in the kernel code (like undefined behavior) than the compiler (or maybe #2033 could be at play here, haven't looked at where |
Does initializing |
Initializing
But it did not help with the issue; the corruption is still there. |
Another user is experiencing a similar issue https://gitlab.freedesktop.org/drm/amd/-/issues/3660. He also states this problem with |
I reported the amdgpu graphics corruption issue at https://gitlab.freedesktop.org/drm/amd/-/issues/3638. I got the issue starting with kernel v6.10.10, so I bisected it to the following commit:
Turns out this only happens when I build the kernel with clang-18 but not when I build it with gcc-14. Reverting the above commit 'fixes' the clang-18 built kernel too and I get no graphics corruption. Current mainline kernel v6.11 shows the same issue on my system.
clang and gcc kernel .config attached.
config_6110_zen3_gcc14.txt
config_6110_zen3_clang18.txt