Flipped the nesting of the loops to improve memory locality.
This ensures that the tight inner loop writes only to the buffer for a single
set. With the old nesting, it had to jump around between the buffers for
different sets.
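
The loop interchange described above can be sketched roughly as follows. This is only an illustration, not the code touched by this commit: the function names, the `value_for` helper, and the list-of-lists buffer layout are all assumptions made for the example. Both versions produce the same buffers; only the order of the writes differs.

```python
# Hypothetical illustration: names and data layout are assumptions,
# not the code changed in this commit.

def value_for(set_index, item_index):
    """Stand-in for whatever is computed per (set, item) pair."""
    return set_index * 1000 + item_index


def fill_item_major(num_sets, num_items):
    """Old nesting: the inner loop touches a different set's buffer each iteration."""
    buffers = [[0] * num_items for _ in range(num_sets)]
    for item_index in range(num_items):
        for set_index in range(num_sets):
            buffers[set_index][item_index] = value_for(set_index, item_index)
    return buffers


def fill_set_major(num_sets, num_items):
    """Flipped nesting: the inner loop writes sequentially into one set's buffer."""
    buffers = [[0] * num_items for _ in range(num_sets)]
    for set_index, buffer in enumerate(buffers):
        for item_index in range(num_items):
            buffer[item_index] = value_for(set_index, item_index)
    return buffers
```

The locality benefit matters mostly in compiled code where each buffer is a contiguous block of memory; Python is used here only to show the loop structure.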