-
Notifications
You must be signed in to change notification settings - Fork 1.6k
Bgemm for arm64 #5287
New issue
Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.
By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.
Already on GitHub? Sign in to your account
base: develop
Are you sure you want to change the base?
Bgemm for arm64 #5287
Conversation
@@ -306,6 +307,13 @@ typedef int blasint; | |||
#define SIZE 8 | |||
#define BASE_SHIFT 3 | |||
#define ZBASE_SHIFT 4 | |||
#elif defined(BFLOAT16_ONLY) |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
I do not think we need to introduce BFLOAT16_ONLY build flag.
The type of FLOAT is the only difference. Can we simplyl use XFLOAT instead of FLOAT for C matrix, for example, in kernel/arm64/bgemm_beta.c ?
Thanks. Failures on CirrusCI are from running out of compute credits (as we're close to the end of the month), I'll rectify that in a minute. Not sure how DGEMM got hit by your PR on at least RISCV and LOONGARCH64 platforms, haven't looked closely yet but possibly some unintentional shifting of GEMM_UNROLL parameters ? (Also not sure why this touches SBGEMM settings ?) |
@@ -169,6 +170,22 @@ | |||
#define STOP_RPCC(COUNTER) | |||
#endif | |||
|
|||
#if defined(HALF) |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
If we remove BUILD_BFLOAT16_ONLY flag, this is not needed I suppose.
@@ -216,6 +216,22 @@ typedef struct { | |||
#define STOP_RPCC(COUNTER) | |||
#endif | |||
|
|||
#if defined(HALF) |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Ditto. See comment in level3.c.
@@ -56,6 +56,9 @@ | |||
#elif defined(BFLOAT16) | |||
#define ERROR_NAME "SBGEMM " | |||
#define GEMV BLASFUNC(sbgemv) | |||
#elif defined(BFLOAT16_ONLY) |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Maybe a different naming ? Perhaps change to SBFLOAT16 for SBGEMM and BFLOAT16 for BGEMM. What do you think ?
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Is there a need to split it? BUILD_BFLOAT16
could cover both cases and enable all bfloat16 ops?
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
@Mousius there's a slight difference between the global BUILD_BFLOAT16 and this BFLOAT16 (that IIRC is local to the interface and gets added by the (C)Makefile ). Maybe we could have a "BGEMM" define that gets set by the Makefile specifically when the function is bgemm and that is processed inside the "elif defined(BFLOAT16)" ? I haven't looked into the assumed requirement for a BUILD_BFLOAT16_ONLY in the level3 driver yet, but my gut feeling is that a local variable in interface/Makefile (and eventually CMakeLists.txt) to modify gemm.c build behaviour for bgemm.o should be sufficient.
@@ -660,15 +687,15 @@ $(KDIR)$(SBGEMMONCOPYOBJ) : $(KERNELDIR)/$(SBGEMMONCOPY) | |||
$(KDIR)$(SBGEMMOTCOPYOBJ) : $(KERNELDIR)/$(SBGEMMOTCOPY) | |||
$(CC) $(CFLAGS) -c -DBFLOAT16 -UDOUBLE -UCOMPLEX $< -o $@ | |||
|
|||
ifneq ($(SBGEMM_UNROLL_M), $(SBGEMM_UNROLL_N)) | |||
#ifneq ($(SBGEMM_UNROLL_M), $(SBGEMM_UNROLL_N)) |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Why is this commented out ?
kernel/arm64/bgemm_beta.c
Outdated
|
||
int CNAME(BLASLONG m, BLASLONG n, BLASLONG dummy1, FLOAT beta, IFLOAT *dummy2, | ||
BLASLONG dummy3, IFLOAT *dummy4, BLASLONG dummy5, FLOAT *c, | ||
BLASLONG ldc) { |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Change FLOAT to IFLOAT or XFLOAT ?
…d makefiles changes for bgemm interface
No description provided.