
Add support for FP16 to openBLAS and shgemm on RISCV #5290


Open · wants to merge 6 commits into base: develop

Srangrang

- Add HFLOAT16 and BUILD_HFLOAT16 macro definitions, distinct from BFLOAT16 and BUILD_BFLOAT16
- Add shgemm for RISCV64_ZVL128B and RISCV64_ZVL256B
- Using FP16 on RISC-V requires the Zfh and Zvfh instruction set extensions
- Enable FP16 support in Makefile.rule
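The Zfh/Zvfh requirement in the list above can be checked at compile time. The following is a minimal sketch, not code from this PR: it relies on the `__riscv_zfh` and `__riscv_zvfh` test macros that RISC-V toolchains define when the corresponding extensions are enabled (for example via a `-march` string such as `rv64gcv_zfh_zvfh`). The function names are hypothetical and only serve the illustration.

```c
#include <stdio.h>

/* Returns 1 if the compiler was told the target has scalar FP16 (Zfh). */
static int built_with_zfh(void) {
#if defined(__riscv_zfh)
    return 1;
#else
    return 0;
#endif
}

/* Returns 1 if the compiler was told the target has vector FP16 (Zvfh). */
static int built_with_zvfh(void) {
#if defined(__riscv_zvfh)
    return 1;
#else
    return 0;
#endif
}

/* Hypothetical helper: vectorized FP16 kernels need both extensions. */
static int can_use_rvv_fp16_kernels(void) {
    return built_with_zfh() && built_with_zvfh();
}

/* Hypothetical helper: print what this translation unit was built with. */
static void report_fp16_support(void) {
    printf("Zfh: %d, Zvfh: %d, RVV FP16 kernels usable: %d\n",
           built_with_zfh(), built_with_zvfh(), can_use_rvv_fp16_kernels());
}
```

On a non-RISC-V host, or with a `-march` string lacking these extensions, all three functions return 0, which is the case where a generic fallback would be needed.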

Related to issue #5279
Co-authored-by Ao Dong

Srangrang and others added 6 commits May 24, 2025 23:55
…r RISCV64_ZVL256B

Added HFLOAT16 support for RISCV64
Added shgemm_kernel_8x8 for RISCV64_ZVL128B and shgemm_kernel_16x8 for RISCV64_ZVL256B based on HFLOAT16
The instruction sets used are ZVFH and ZFH, which need to be supported by RVV1.0

Related to issue OpenMathLib#5279
Co-authored-by Linjin Li
Added shgemm_kernel_8x8 for RISCV64_ZVL128B and shgemm_kernel_16x8 fo…
- modify the macro conditions in Makefile.system
- Delete development test code

Related to issue #5279
martin-frbg added this to the 0.3.31 milestone on Jun 11, 2025
@martin-frbg
Collaborator

I think we'll need some adjustments in interface/gemm.c to disable the "small matrix" pathway, adjust the function name used in error messages, and also keep hfloat16 out of the optimum thread number computations, now that it is separate from bfloat16. Please see the attached file (and add this or something similar to the PR if you agree):
gemm.c.txt

Also, it probably makes sense to separate the two types to such an extent that one can be built without the other. I have further modified your version of gensymbol(.pl) to take another parameter for BUILD_HFLOAT16, and adjusted exports/Makefile accordingly:
Makefile.txt
gensymbol.pl.txt
gensymbol.txt
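The kind of per-type separation described for interface/gemm.c can be pictured with a small preprocessor sketch. This is an illustration only and does not reproduce the attached gemm.c.txt: `USE_SMALL_MATRIX_PATH`, `TUNE_THREAD_COUNT`, and the accessor functions are hypothetical names, the values in the BFLOAT16 branch are placeholders, and HFLOAT16 is defined by hand here where a real build would pass it from the build system.

```c
#include <string.h>

/* Assumption: defined here only for illustration; normally a build flag. */
#define HFLOAT16

#if defined(HFLOAT16)
#define ERROR_NAME "SHGEMM "        /* name reported in error messages */
#define USE_SMALL_MATRIX_PATH 0     /* hypothetical: no small-matrix path */
#define TUNE_THREAD_COUNT 0         /* hypothetical: skip thread tuning */
#elif defined(BFLOAT16)
#define ERROR_NAME "SBGEMM "
#define USE_SMALL_MATRIX_PATH 1     /* placeholder value */
#define TUNE_THREAD_COUNT 1         /* placeholder value */
#else
#define ERROR_NAME "SGEMM "
#define USE_SMALL_MATRIX_PATH 1     /* placeholder value */
#define TUNE_THREAD_COUNT 1         /* placeholder value */
#endif

/* Hypothetical accessors so the selected configuration can be inspected. */
static const char *gemm_error_name(void) { return ERROR_NAME; }
static int small_matrix_path_enabled(void) { return USE_SMALL_MATRIX_PATH; }
static int thread_tuning_enabled(void) { return TUNE_THREAD_COUNT; }
```

Testing HFLOAT16 before BFLOAT16 keeps the two types independent, so neither branch silently inherits the other's settings when only one of them is built.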

@martin-frbg
Collaborator

In a similar vein, the affected files in the benchmark folder probably need adjusting as well; please check:
Makefile.txt

gemm.c.txt

@martin-frbg
Collaborator

Lastly, I found that I cannot build the "generic" replacements on a non-RISCV host unless I remove the ifneq ($(SHGEMM_UNROLL_M), $(SHGEMM_UNROLL_N)) conditional around the compiler commands for SHGEMMINCOPYOBJ and SHGEMMITCOPYOBJ in kernel/Makefile.L3. I also wonder whether it would make sense to provide hfloat16 conversion functions in kernel/generic/gemm_kernel2x2.c, similar to what is present there for bfloat16.

Unfortunately, I have not had the time to try a CMake build yet.
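For readers unfamiliar with the two 16-bit formats, the conversion functions mentioned in the last comment could look roughly like the sketch below. It is not taken from OpenBLAS; the function names are hypothetical, and it only covers the widening direction (16-bit to float). IEEE half precision uses 1 sign, 5 exponent, and 10 mantissa bits, so it needs re-biasing and subnormal handling, whereas bfloat16 is simply the upper 16 bits of a float.

```c
#include <stdint.h>
#include <string.h>

/* Reinterpret a 32-bit pattern as a float without aliasing problems. */
static float bits_to_float(uint32_t u) {
    float f;
    memcpy(&f, &u, sizeof f);
    return f;
}

/* Hypothetical name: widen an IEEE half-precision bit pattern to float. */
static float hfloat16_to_float(uint16_t h) {
    uint32_t sign = (uint32_t)(h & 0x8000u) << 16;
    uint32_t exp  = (h >> 10) & 0x1Fu;
    uint32_t man  = h & 0x3FFu;
    uint32_t bits;

    if (exp == 0x1Fu) {                 /* infinity or NaN */
        bits = sign | 0x7F800000u | (man << 13);
    } else if (exp != 0) {              /* normal: re-bias 15 -> 127 */
        bits = sign | ((exp + 112u) << 23) | (man << 13);
    } else if (man == 0) {              /* signed zero */
        bits = sign;
    } else {                            /* subnormal: normalize mantissa */
        int e = -1;
        do {
            man <<= 1;
            e++;
        } while ((man & 0x400u) == 0);
        bits = sign | ((uint32_t)(112 - e) << 23) | ((man & 0x3FFu) << 13);
    }
    return bits_to_float(bits);
}

/* Hypothetical name: bfloat16 is the top half of a float's bit pattern. */
static float bfloat16_to_float(uint16_t b) {
    return bits_to_float((uint32_t)b << 16);
}
```

The same bit pattern decodes differently under the two formats: 0x3C00 is 1.0 as half precision but 2^-7 as bfloat16, which is one reason the two types need separate macros and code paths.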

2 participants