Add interpolation kernels (only supports 2x and align_corners=False) #70
Conversation
In general, let's use mgrid instead of nl.arange. Also, the block dimension for SBUF and PSUM tensors is deprecated, so the block either needs to be moved to the free dimension or replaced with a list of 2D tiles.
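As a rough illustration of the mgrid suggestion, the sketch below uses NumPy rather than NKI, on the assumption that the NKI mgrid follows NumPy-style semantics. The tile sizes are hypothetical and kept small. It only shows that manually broadcast arange indices and mgrid indices select the same elements.

```python
import numpy as np

# Hypothetical small sizes, chosen only for illustration.
P_TILE_SIZE = 4
H_TILE = 3
W = 5

src = np.arange(P_TILE_SIZE * H_TILE * W, dtype=np.float32).reshape(P_TILE_SIZE, H_TILE, W)

# arange style (as in the PR): one index vector per axis, broadcast by hand.
i_p_a = np.arange(P_TILE_SIZE)[:, None, None]
i_h_a = np.arange(H_TILE)[None, :, None]
i_w_a = np.arange(W)[None, None, :]
tile_a = src[i_p_a, i_h_a, i_w_a]

# mgrid style: a single expression yields one index grid per axis.
i_p_m, i_h_m, i_w_m = np.mgrid[0:P_TILE_SIZE, 0:H_TILE, 0:W]
tile_m = src[i_p_m, i_h_m, i_w_m]
```

Both forms index the full tile; the mgrid form avoids the chain of `[:, None, None]` reshapes, which is easy to get wrong as the number of dimensions grows.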
h_end_hbm_dst = 2 * h_end_hbm_src

for p in nl.affine_range(math.ceil(n * c / P_TILE_SIZE)):
    out_tile = nl.ndarray([P_TILE_SIZE, h_tile_size_dst, w_dst], dtype=src_arr.dtype, buffer=nl.sbuf)
The block dimension on SBUF is being deprecated. Instead, use the free dimension for P_TILE_SIZE, or create a list of 2D tiles and use those.
Hey @JonathanHenson!
Thank you for your feedback! By "block dimension on SBUF is being deprecated", do you mean that an nl.ndarray on SBUF can no longer have more (logical) dimensions than the two physical dimensions? Do you have a code snippet of the two alternatives, and/or docs that explain in more detail what is being deprecated?
Just trying to adjust my mental model of what is and is no longer allowed so we can adjust how we approach the problem. In the current implementation, we start from a 4D NCHW input tensor, which is reshaped into a 3D tensor of shape (N*C, H, W). We then tile along both the first and second dimensions, i.e. we operate on tiles of shape (p_tile_size, h_tile_size, W), with the first dimension mapped to the SBUF partition dimension and the remaining two dimensions mapped to the SBUF free dimension.
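The tiling scheme described above can be sketched in plain NumPy (not NKI). The sizes below are hypothetical, and the 2D view at the end is only one possible reading of "moving to the free dimension"; whether it matches what the reviewer has in mind is not confirmed in this thread.

```python
import math
import numpy as np

# Hypothetical sizes for illustration; a real partition tile would be larger.
P_TILE_SIZE = 4
H_TILE_SIZE = 2

n, c, h, w = 2, 3, 4, 5
src = np.arange(n * c * h * w, dtype=np.float32).reshape(n, c, h, w)

# Reshape NCHW into (N*C, H, W), as described in the comment above.
flat = src.reshape(n * c, h, w)

tiles_3d = {}  # (partition, h_tile, W) tiles, as in the current implementation
tiles_2d = {}  # same data viewed as (partition, flattened free dimension)
for p in range(math.ceil(n * c / P_TILE_SIZE)):
    p_lo = p * P_TILE_SIZE
    p_hi = min(p_lo + P_TILE_SIZE, n * c)
    for t in range(math.ceil(h / H_TILE_SIZE)):
        h_lo = t * H_TILE_SIZE
        h_hi = min(h_lo + H_TILE_SIZE, h)
        tile = flat[p_lo:p_hi, h_lo:h_hi, :]
        tiles_3d[(p, t)] = tile
        # Fold the two free dimensions into one: shape (p_size, h_size * W).
        tiles_2d[(p, t)] = tile.reshape(p_hi - p_lo, (h_hi - h_lo) * w)
```

With these sizes the last partition tile holds only 2 rows, which shows why the loop bound uses `math.ceil` and the slice end is clamped.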
out_tile = nl.ndarray([P_TILE_SIZE, h_tile_size_dst, w_dst], dtype=src_arr.dtype, buffer=nl.sbuf)

### Load input array from HBM
i_p = p * P_TILE_SIZE + nl.arange(P_TILE_SIZE)[:, None, None]
use mgrid
### Core region
weight_2d = weight_1d**2

i_p = nl.arange(P_TILE_SIZE)[:, None, None, None, None]
mgrid?
d_start_hbm_dst = 2 * d_start_hbm_src + 1 if d_start_hbm_src else 0
d_end_hbm_dst = 2 * d_end_hbm_src
for p in nl.affine_range(math.ceil(n * c / P_TILE_SIZE)):
    out_tile = nl.ndarray([P_TILE_SIZE, d_tile_size_dst, h_dst, w_dst], dtype=src_arr.dtype, buffer=nl.sbuf)
Refactor P_TILE_SIZE to the free dimension, or use a list of 2D tiles. The block dimension is being deprecated.
This PR adds NKI kernel samples for bilinear and trilinear interpolation. As the title mentions, these kernels only support upsampling to 2x dimensions with align_corners=False.
In collaboration with @plienhar.
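For readers who want a reference for what 2x upsampling with align_corners=False computes, here is a minimal NumPy sketch. It is not the kernel from this PR, and the function names are hypothetical. It assumes floating-point input and relies on the fact that, for an exact 2x scale, each output sample is a 0.75/0.25 blend of its two nearest source samples, with indices clamped at the borders.

```python
import numpy as np

def upsample2x_1d(x, axis):
    """Linear 2x upsampling along one axis (align_corners=False convention)."""
    x = np.moveaxis(x, axis, -1)
    n = x.shape[-1]
    # Neighbours x[k-1] and x[k+1], clamped to the valid index range.
    left = np.concatenate([x[..., :1], x[..., :-1]], axis=-1)
    right = np.concatenate([x[..., 1:], x[..., -1:]], axis=-1)
    out = np.empty(x.shape[:-1] + (2 * n,), dtype=x.dtype)
    out[..., 0::2] = 0.25 * left + 0.75 * x   # output 2k sits at source k - 0.25
    out[..., 1::2] = 0.75 * x + 0.25 * right  # output 2k+1 sits at source k + 0.25
    return np.moveaxis(out, -1, axis)

def bilinear_upsample2x(x):
    """Separable bilinear 2x upsampling over the last two axes (H, W)."""
    return upsample2x_1d(upsample2x_1d(x, -2), -1)

example = bilinear_upsample2x(np.array([[0.0, 4.0]]))
# example has shape (2, 4); each row is [0., 1., 3., 4.]
```

A trilinear variant would apply the same 1D step along a third axis (D) as well. A reference like this can serve as the expected output when checking a kernel numerically.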
Testing:
Please see detailed unit test requirements in the CONTRIBUTING.md
nki.baremetal
nki.benchmark
Pull Request Checklist