[Dataflow] Packed systolic matmul test cases #264
@jiahanxie353 Thanks for contributing! Please follow this guideline to format your code.
Thanks for getting back, @chhzh123!
Thanks! I believe I have passed all the code-format tests. Any advice on fixing the issue? I wonder why I can't index into an
Maybe you can try separating the memory access: first load
Can you test a larger systolic array? I tested your code with L, D = 4, 4 (an effective SA size of 2x2), and it failed.
import numpy as np

L, D = 2, 2
M, N, K = L, 1 * D, D
What is 1 here? Maybe create a constant variable?
I removed L, D and only kept M, N, K for clarity. Please let me know if you prefer to keep L, D.
if PP == 2:
    np_type = np.int16
    allo_type = int16
Can you also add PP=4, 8 cases?
Sure, I added PP=4. Right now I'm hard-coding the basic element type as int8, which forces allo_type == int8 * PP. I will make it parametric and add different PP-allo_type combinations.
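To make the int8 * PP relationship concrete, here is a minimal NumPy-only sketch of packing PP int8 values into one wider integer along the last axis, which matches the allo_type[M, K // PP] shape discussed in this thread. The helper name pack_last_axis and the byte order (first element in the lowest byte) are assumptions for illustration; they are not taken from the PR or from Allo.

```python
import numpy as np

def pack_last_axis(X, PP):
    """Pack PP consecutive int8 values along the last axis into one integer.

    Hypothetical helper for illustration only. Assumes element 0 of each
    group lands in the lowest byte; the test in the PR may use another order.
    """
    M, K = X.shape
    assert X.dtype == np.int8 and K % PP == 0 and PP in (1, 2, 4, 8)
    signed = np.dtype(f"int{8 * PP}")      # e.g. int16 for PP == 2
    unsigned = np.dtype(f"uint{8 * PP}")
    # Reinterpret as raw bytes so shifting does not sign-extend negatives.
    groups = X.view(np.uint8).reshape(M, K // PP, PP).astype(np.uint64)
    shifts = 8 * np.arange(PP, dtype=np.uint64)
    # Byte lanes never overlap, so OR-reduction assembles the packed word.
    words = np.bitwise_or.reduce(groups << shifts, axis=-1)
    return words.astype(unsigned).view(signed)

X = np.array([[1, 2, -1, 0]], dtype=np.int8)
packed2 = pack_last_axis(X, 2)   # shape (1, 2), dtype int16
packed4 = pack_last_axis(X, 4)   # shape (1, 1), dtype int32
```

Deriving the packed dtype from PP this way avoids one if-branch per packing factor, at the cost of fixing the element type to int8.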
6f9b7ee to 680f364
with allo.meta_elif(j == 0):
    # i > 0
    for k in range(K):
        fifo_A[i, j + 1].put(X_packed[(i - 1) * PP, k])
This line looks suspicious. X_packed has type allo_type[M, K // PP]. For example, when M, N, K = 4, 4, 2, X_packed has type allo_type[4, 1]. Now take the for-loop: for k in range(2). When k = 1, the access to the X_packed array should go out of bounds, yet no error is raised and the results are all correct... Maybe we should look into this.
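The bounds concern can be reproduced outside Allo with a plain NumPy array of the same shape. This is only a sketch of the index arithmetic: PP = 2 is inferred from the allo_type[4, 1] shape quoted above, and NumPy's bounds checking says nothing about how Allo lowers or checks the same access.

```python
import numpy as np

M, N, K = 4, 4, 2
PP = 2  # assumption: inferred from K // PP == 1 in the comment above

# Stand-in for allo_type[M, K // PP]; int16 corresponds to the PP == 2 case.
X_packed = np.zeros((M, K // PP), dtype=np.int16)

# Column indices produced by `for k in range(K)` that exceed the packed width.
out_of_bounds = [k for k in range(K) if k >= X_packed.shape[1]]

# NumPy rejects the k = 1 access that the review comment flags.
try:
    X_packed[0, 1]
    raised = False
except IndexError:
    raised = True
```

If the loop were meant to walk the packed columns, a bound of K // PP rather than K would keep every access in range here; whether that matches the intended dataflow is a question for the PR author.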
Description
This PR tries to add a test case for a packed systolic array using the Dataflow module.
Proposed Solutions
I have added a test case called tests/dataflow/test_packed_systolic.py, but I'm still trying to figure out the indexing problems on this line, which raises:
Examples
An example is shown in tests/dataflow/test_packed_systolic.py. Is there any obvious error in the example?
Checklist