Skip to content

Bgemm for arm64 #5287

New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

Draft
wants to merge 9 commits into
base: develop
Choose a base branch
from
Draft

Bgemm for arm64 #5287

wants to merge 9 commits into from

Conversation

taoye9
Copy link
Contributor

@taoye9 taoye9 commented May 28, 2025

No description provided.

@taoye9 taoye9 marked this pull request as draft May 28, 2025 13:27
@@ -306,6 +307,13 @@ typedef int blasint;
#define SIZE 8
#define BASE_SHIFT 3
#define ZBASE_SHIFT 4
#elif defined(BFLOAT16_ONLY)
Copy link
Contributor

@annop-w annop-w May 28, 2025

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

I do not think we need to introduce BFLOAT16_ONLY build flag.
The type of FLOAT is the only difference. Can we simplyl use XFLOAT instead of FLOAT for C matrix, for example, in kernel/arm64/bgemm_beta.c ?

@martin-frbg
Copy link
Collaborator

Thanks. Failures on CirrusCI are from running out of compute credits (as we're close to the end of the month), I'll rectify that in a minute. Not sure how DGEMM got hit by your PR on at least RISCV and LOONGARCH64 platforms, haven't looked closely yet but possibly some unintentional shifting of GEMM_UNROLL parameters ? (Also not sure why this touches SBGEMM settings ?)

@@ -169,6 +170,22 @@
#define STOP_RPCC(COUNTER)
#endif

#if defined(HALF)
Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

If we remove BUILD_BFLOAT16_ONLY flag, this is not needed I suppose.

@@ -216,6 +216,22 @@ typedef struct {
#define STOP_RPCC(COUNTER)
#endif

#if defined(HALF)
Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Ditto. See comment in level3.c.

@@ -56,6 +56,9 @@
#elif defined(BFLOAT16)
#define ERROR_NAME "SBGEMM "
#define GEMV BLASFUNC(sbgemv)
#elif defined(BFLOAT16_ONLY)
Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Maybe a different naming ? Perhaps change to SBFLOAT16 for SBGEMM and BFLOAT16 for BGEMM. What do you think ?

Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Is there a need to split it? BUILD_BFLOAT16 could cover both cases and enable all bfloat16 ops?

Copy link
Collaborator

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

@Mousius there's a slight difference between the global BUILD_BFLOAT16 and this BFLOAT16 (that IIRC is local to the interface and gets added by the (C)Makefile ). Maybe we could have a "BGEMM" define that gets set by the Makefile specifically when the function is bgemm and that is processed inside the "elif defined(BFLOAT16)" ? I haven't looked into the assumed requirement for a BUILD_BFLOAT16_ONLY in the level3 driver yet, but my gut feeling is that a local variable in interface/Makefile (and eventually CMakeLists.txt) to modify gemm.c build behaviour for bgemm.o should be sufficient.

@@ -660,15 +687,15 @@ $(KDIR)$(SBGEMMONCOPYOBJ) : $(KERNELDIR)/$(SBGEMMONCOPY)
$(KDIR)$(SBGEMMOTCOPYOBJ) : $(KERNELDIR)/$(SBGEMMOTCOPY)
$(CC) $(CFLAGS) -c -DBFLOAT16 -UDOUBLE -UCOMPLEX $< -o $@

ifneq ($(SBGEMM_UNROLL_M), $(SBGEMM_UNROLL_N))
#ifneq ($(SBGEMM_UNROLL_M), $(SBGEMM_UNROLL_N))
Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Why is this commented out ?


int CNAME(BLASLONG m, BLASLONG n, BLASLONG dummy1, FLOAT beta, IFLOAT *dummy2,
BLASLONG dummy3, IFLOAT *dummy4, BLASLONG dummy5, FLOAT *c,
BLASLONG ldc) {
Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Change FLOAT to IFLOAT or XFLOAT ?

Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Labels
None yet
Projects
None yet
Development

Successfully merging this pull request may close these issues.

4 participants