[Bug]: Build fails on aarch64 with LTO #8148
Comments
Hi @vifino , Please try rebuilding with your mcpu/march args set to |
Hey @kareem-wolfssl, it probably would compile, except then it would require the crypto extensions, no? |
If you are trying to build for a system without crypto ASM extensions, you will need to remove asm from your --enable-sp line and add --disable-asm. |
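For readers following along, the suggestion above might look roughly like the following. This is a sketch only: it takes the `--enable-sp=yes,asm` flag from the original report, drops the `asm` part, and adds `--disable-asm`. All other flags from the report are left out for brevity, and the guard around `./configure` is just there so the snippet does nothing outside a wolfSSL source tree.

```shell
# Sketch: build without any assembly, as suggested in the comment above.
# Run from the root of a wolfSSL source tree; other configure flags omitted.
if [ -x ./configure ]; then
    ./configure --enable-sp=yes --disable-asm
fi
```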
To be clear: Building 5.7.2 for Disabling all of the previous optimized implementations because of a new implementation does not seem like a good solution to me but rather like a big regression. Please gate the part requiring |
Hi @vifino , I think the difference between the versions is that we try and enable --enable-armasm by default. Can you please try with Thanks, |
Hey @dgarske. Yes, the code now compiles without error. I find the naming of the What options would yield the fastest code on aarch64 |
Hi @vifino , Yes for the aarch64 the As of right now the only way to use the aarch64 speedups is with the crypto extensions, however I believe it is possible to enable the aarch32 assembly speedups, which would still be significantly faster than the C implementation. Which algorithms are you interested in improving performance for? Based on that feedback I can make a recommendation for Raspberry Pi 4 build options to use. Thanks, |
Hey @dgarske, I maintain the wolfSSL derivation for NixOS and would like to ship the most optimal general configuration for our users. No specific benchmarks I'm tuning for, but we're not storage-bound. For For aarch64, our target is plain |
Hi @vifino, Unfortunately we don't support assembly for Aarch64 armv8-a. That is, there is only assembly when the crypto extensions are available. Is it possible to have different builds for different versions of ARMv8-A? Sean |
Hey @SparkiDev, it's complicated. While there is the possibility of building the Nix derivations targeting armv8-a+crypto or similar, those are rarely used on anything but Darwin/Apple Silicon, as the supported linux target is plain So, realistically, the answer is "no" - 99% of |
Hi @vifino, I understand. I'll let you know when I have a PR for this work. Sean |
Hi @vifino, The PR has been merged. Let us know if this fixes your issue. Thanks, |
Hey @SparkiDev, apologies for the late response. I've tried the current master now that #8293 is merged. I appreciate the effort you all put in! Problem is, I'm running into the same build failure with LTO as before. I've tried with How can I get the linker to not complain without gcc emitting the instructions on its own? |
Hi @vifino , Did you try adding CFLAGS=-DWOLFSSL_ARMASM_NO_HW_CRYPTO to your configure also? It will disable the ARM crypto assembly instructions. Thanks, |
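As a rough illustration of the suggestion above, the define can be passed on the configure line. This is a sketch under assumptions: `--enable-armasm` is the option mentioned elsewhere in this thread, and any CFLAGS already in use (such as the ones listed in the report) would need to be included in the same variable, since setting CFLAGS here replaces rather than appends.

```shell
# Sketch: keep ARM assembly enabled but disable the crypto-extension instructions.
# Run from the root of a wolfSSL source tree; existing CFLAGS must be merged in by hand.
if [ -x ./configure ]; then
    ./configure --enable-armasm CFLAGS="-DWOLFSSL_ARMASM_NO_HW_CRYPTO"
fi
```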
Hi @dgarske, I did not, but that seems to be missing most of the potential of that patch. Given that it added CPUID gating of the instructions, that's what I'm aiming for, as that's the most ideal situation for packaging. I believe the only missing step is to get the linker to accept the additional instructions without making the actual compiler emit the +crypto instructions. |
Hi @vifino, If linux is defined, then the OS is checked for CPU features. Thanks, |
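To see what the OS reports on a given machine, one option is to look at the `Features` line in `/proc/cpuinfo`, where Linux on aarch64 lists flags such as `aes`, `pmull`, `sha1` and `sha2`. The helper below is a hypothetical sketch for checking a host by hand (the name `has_arm_crypto` is made up here); it is not how wolfSSL performs its own runtime check.

```shell
# has_arm_crypto is a hypothetical helper, not part of wolfSSL.
# It succeeds only if the given Features string lists all four
# ARMv8 crypto extension flags: aes, pmull, sha1, sha2.
has_arm_crypto() {
    features=" $1 "
    for flag in aes pmull sha1 sha2; do
        case "$features" in
            *" $flag "*) ;;      # flag present, keep checking
            *) return 1 ;;       # a missing flag means no full crypto support
        esac
    done
    return 0
}

# Usage: read the first Features line of the running kernel.
# On hosts without such a line (e.g. x86) the string is empty and the answer is "no".
line=$(grep -m1 '^Features' /proc/cpuinfo 2>/dev/null | cut -d: -f2 || true)
if has_arm_crypto "$line"; then
    echo "crypto extensions: yes"
else
    echo "crypto extensions: no"
fi
```

On a board without the extensions, such as the Pi 4 mentioned in this thread, this should print "crypto extensions: no".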
@SparkiDev here are my results with the latest PR #8311 on a Pi4 (this HW does not have crypto extensions). FYI: Works fine with
|
FYI: @SparkiDev is making progress on resolving this. The SHA2/SHA3 runtime CPUID detection was resolved for AARCH64. Next he will work on an AES GCM issue. |
Hey @dgarske, that's awesome news! What's the arch set to in your Pi testing, armv8-a or armv8-a+crypto? Does it work with I'll check it out tomorrow myself. |
Hi @vifino , I compiled it on the Pi4 and it defaults to including crypto extensions ( That is correct about the release. We could not get that work completed in time for the release. You would need to use v5.7.6 + PR 8325. Thanks, |
Hi @vifino , I believe this issue has been fully resolved, but it would be great to get your feedback before closing this issue. Do you have any updates? Thanks, |
Hey @dgarske, apologies for the late response, extraordinarily busy the past few weeks. I've applied PR 8325 on top of 5.7.6 and tested it on x86 and aarch64 hosts with crypto extensions. On the ARM machine, if I The unit test fails in a very... interesting and verbose way: https://gist.github.com/vifino/35f705d2da913c8b0a505d9251147155 Hope this helps! |
Hi @vifino, I've tried to reproduce this by checking out the v5.7.6-stable tag and applying the PR 8345 patch on top and running with QEMU. In the verbose test output I don't see any failure either. Which Aarch64 CPU are you testing on? Sean |
Hey @SparkiDev, the Gist is much longer than GitHub shows by default, you have to view the full file. That is running on an Ampere CPU in the Oracle Cloud -
Hope this helps. |
Hey @SparkiDev. That commit is already in 5.7.6-stable, which I am using, plus PR 8325 on top. |
Hi @vifino , Is the
I'm wondering if this error is not related to the aarch64 with LTO optimizations? Thanks, |
Hi @dgarske! I haven't tested if it fails at the same point, but it always fails when I have the 5.7.6-stable + PR 8325 applied and "--enable-armasm". The issue title is kinda outdated by now, yes, but the CPUID detection for arm64 is the fix for this issue, but as mentioned, I can't get the tests to pass. |
Hi @vifino, I can't think of anything different about the assembly code other than it should be faster and it may use more RAM. In case of a memory issue, could you please try disabling SP, and leaving ARM asm on. Thanks, |
Contact Details
Here, on GitHub, preferably.
Version
5.7.4
Description
The build fails with error messages like the following on aarch64 Linux machines, which should compile wolfssl for -mcpu=armv8-a with LTO. 5.7.2 worked fine.
./configure flags:
--disable-static --disable-dependency-tracking --prefix=/nix/store/q5adnsfcgdhbcc5siq3c5v8w5d9h19dr-wolfssl-all-5.7.4 --bindir=/nix/store/q5adnsfcgdhbcc5siq3c5v8w5d9h19dr-wolfssl-all-5.7.4/bin --sbindir=/nix/store/q5adnsfcgdhbcc5siq3c5v8w5d9h19dr-wolfssl-all-5.7.4/sbin --includedir=/nix/store/l33016xyxb23f8l4v0hbw7kx086hssff-wolfssl-all-5.7.4-dev/include --oldincludedir=/nix/store/l33016xyxb23f8l4v0hbw7kx086hssff-wolfssl-all-5.7.4-dev/include --mandir=/nix/store/q5adnsfcgdhbcc5siq3c5v8w5d9h19dr-wolfssl-all-5.7.4/share/man --infodir=/nix/store/q5adnsfcgdhbcc5siq3c5v8w5d9h19dr-wolfssl-all-5.7.4/share/info --docdir=/nix/store/fjivg6dwixisnz8vs29d74mkk0ch4f6d-wolfssl-all-5.7.4-doc/share/doc/wolfssl --libdir=/nix/store/bkkhgrpb580lvqgvwr06750pk9j12sjr-wolfssl-all-5.7.4-lib/lib --libexecdir=/nix/store/bkkhgrpb580lvqgvwr06750pk9j12sjr-wolfssl-all-5.7.4-lib/libexec --localedir=/nix/store/bkkhgrpb580lvqgvwr06750pk9j12sjr-wolfssl-all-5.7.4-lib/share/locale --enable-all --enable-reproducible-build --enable-pkcs11 --enable-writedup --enable-base64encode --enable-bigcache --enable-sp=yes,asm --enable-sp-math-all --enable-harden CC=gcc
CFLAGS contain
-fPIC -O2 -U_FORTIFY_SOURCE -Wformat -Wformat-security -Werror=format-security -fzero-call-used-regs=used-gpr -fstack-protector-strong --param ssp-buffer-size=4 -fno-strict-overflow -march=armv8-a
LDFLAGS contain
-flto
Reproduction steps
git clone -b wolfssl-5.7.4 --depth=1 https://github.com/vifino/nixpkgs
nix-build -A wolfssl
Relevant log output