Lane sequencer behavior error with unbalanced workloads #248
Comments
I had the same doubt when I read the code. Maybe the code that assigns vstart should be changed to:
vfu_operation_d.vstart = pe_req.vstart / NrLanes;
// If lane_id_i < vstart % NrLanes, this lane needs to execute one micro-operation less.
if (lane_id_i < pe_req.vstart[idx_width(NrLanes)-1:0]) vfu_operation_d.vstart += 1;
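To sanity-check the suggested change, here is a small Python model of the arithmetic. It is not the RTL. It assumes 4 lanes and that element i is handled by lane i % NrLanes, and the function names are made up for illustration. Under those assumptions, adding 1 on the low lanes gives exactly the number of elements each lane has to skip before vstart.

```python
NR_LANES = 4  # assumed lane count, not taken from the Ara configuration


def lane_vstart_proposed(vstart: int, lane_id: int, nr_lanes: int = NR_LANES) -> int:
    """Per-lane vstart with the suggested '+= 1' adjustment."""
    lane_vstart = vstart // nr_lanes
    # Lanes below vstart % nr_lanes own one extra element before vstart.
    if lane_id < vstart % nr_lanes:
        lane_vstart += 1
    return lane_vstart


def skipped_elements(vstart: int, lane_id: int, nr_lanes: int = NR_LANES) -> int:
    """Brute-force count of elements below vstart that belong to this lane.

    Assumes element i is mapped to lane i % nr_lanes.
    """
    return sum(1 for i in range(vstart) if i % nr_lanes == lane_id)


if __name__ == "__main__":
    for vstart in (1, 65, 2045):
        per_lane = [lane_vstart_proposed(vstart, lane) for lane in range(NR_LANES)]
        print(f"vstart={vstart}: per-lane vstart = {per_lane}")
```

For vstart=1 only lane 0 skips an element, and for vstart=65 lane 0 skips 17 while the other lanes skip 16.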
Hello @ckf104,
Regarding the specific code block (lane sequencer line 270) that you mentioned, I'm not entirely sure why it uses
To briefly explain the current setup: vstart is set to 16'h41, which is equivalent to 10'd65. The overall
Here,
However, my confusion lies in this theoretical situation: when
In any case, regardless of the design philosophy, it might be advisable to include additional conditional checks to prevent overflow or underflow when calculating values at the upper and lower bounds. Once an overflow or underflow occurs, the bank from which values are actually fetched may deviate significantly from the originally intended bank.
Supplementary illustration (For my previous change to
In the above figure, you can see that if you change to use
Hi @WEIhabi,
which is the expected result.
which is also the result we expected. Given vl=2046 and vstart=2045, Ara should take only one element for calculation. That element is dispatched to lane 1, and lanes 0, 2, and 3 don't do any calculation, hence overflow of the address calculation isn't a problem. I think the true problem is how
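The vl=2046, vstart=2045 case can be checked with a few lines of Python. This is only a sketch: it assumes 4 lanes and that element i goes to lane i % NrLanes, and the helper name is hypothetical.

```python
def active_elements_per_lane(vl: int, vstart: int, nr_lanes: int = 4) -> list:
    """Count the elements in [vstart, vl) that each lane has to process.

    Assumes element i is mapped to lane i % nr_lanes.
    """
    counts = [0] * nr_lanes
    for i in range(vstart, vl):
        counts[i % nr_lanes] += 1
    return counts


if __name__ == "__main__":
    # Only element 2045 is active, and 2045 % 4 == 1.
    print(active_elements_per_lane(2046, 2045))  # [0, 1, 0, 0]
```

Only lane 1 has work to do, which matches the description above.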
Hello @ckf104,
For the
But there may be other ways to modify this :)
Thank you for this, it's a wake-up call!
Issue
As mentioned in #237, the relationship between vl and vstart may not be clearly defined in the dispatcher, so we limit vstart to be smaller than vl (the maximum vstart value can only be up to vlmax-1). Even so, we can still see the following error scenario.
The main reason is that when the main sequencer hands a request to the lane sequencer on each lane, the handling of workload imbalance may not be fully considered, so an underflow occurs and the value in the wrong bank is fetched.
For example, if the current vadd.vv instruction is configured with vl=6 and vstart=1 by the previous vector CSR writes, the lane sequencer on each lane first calculates vstart (vfu_operation_d.vstart) = pe_req.vstart/NrLanes = 1/4 = 0, so it is 0 in all lanes. A lane then reduces its vstart value by 1 if lane_id < vstart%NrLanes, which is true only for lane 0 here. The vstart value in lane 0 therefore underflows from 0 to -1, which is 12'hfff when VLEN is set to 2048.
If there's anything I'm considering wrong, please let me know, thanks!
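The underflow described above can be reproduced with a small Python model of the arithmetic. This is a sketch rather than the actual RTL: the lane count of 4 and the 12-bit width are assumptions taken from the 1/4 division and the 12'hfff value in this example, and the function name is made up.

```python
NR_LANES = 4       # assumed, matching the 1/4 division in the example
VSTART_WIDTH = 12  # assumed, matching the 12'hfff value quoted for VLEN=2048


def lane_vstart_as_described(vstart: int, lane_id: int) -> int:
    """Per-lane vstart as described in this issue: divide, then subtract 1 on low lanes."""
    mask = (1 << VSTART_WIDTH) - 1
    lane_vstart = vstart // NR_LANES
    if lane_id < vstart % NR_LANES:
        lane_vstart -= 1
    # Fixed-width unsigned hardware wraps around instead of going negative.
    return lane_vstart & mask


per_lane = [lane_vstart_as_described(1, lane) for lane in range(NR_LANES)]

if __name__ == "__main__":
    print([hex(v) for v in per_lane])  # ['0xfff', '0x0', '0x0', '0x0']
```

With vstart=1, lane 0 wraps to 0xfff while lanes 1 to 3 stay at 0.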
Supplementary Pictures