Create MaxPool2D.py #67

jimburtoft · 2025-04-09T18:52:31Z

This kernel is meant to replace the MaxPool2D function. I'm not sure whether it is faster than a traced PyTorch version.

However, it does show an interesting use of masking to avoid extra memory writes: instead of padding every edge with -inf rows and columns, I adjust the indices and mask out the positions that fall in the padding I never inserted.
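As an illustration of the masking idea described above, here is a minimal plain-Python sketch (not the NKI kernel from this PR). The function names `maxpool2d_padded` and `maxpool2d_masked` are hypothetical and work on a single 2D channel. The first materializes the -inf border; the second shifts the indices by the padding and skips out-of-range positions, so no padded copy is ever written.

```python
NEG_INF = float("-inf")


def _out_size(size, kernel_size, stride, padding):
    # Same output-size formula as in the PR diff.
    return (size + 2 * padding - kernel_size) // stride + 1


def maxpool2d_padded(x, kernel_size, stride, padding):
    """Reference version: build a -inf padded copy, then pool over it."""
    h, w = len(x), len(x[0])
    wp = w + 2 * padding
    padded = [[NEG_INF] * wp for _ in range(padding)]
    padded += [[NEG_INF] * padding + list(row) + [NEG_INF] * padding for row in x]
    padded += [[NEG_INF] * wp for _ in range(padding)]
    h_out = _out_size(h, kernel_size, stride, padding)
    w_out = _out_size(w, kernel_size, stride, padding)
    return [
        [
            max(
                padded[i * stride + di][j * stride + dj]
                for di in range(kernel_size)
                for dj in range(kernel_size)
            )
            for j in range(w_out)
        ]
        for i in range(h_out)
    ]


def maxpool2d_masked(x, kernel_size, stride, padding):
    """Masked version: shift indices by the padding and skip invalid ones."""
    h, w = len(x), len(x[0])
    h_out = _out_size(h, kernel_size, stride, padding)
    w_out = _out_size(w, kernel_size, stride, padding)
    out = []
    for i in range(h_out):
        row = []
        for j in range(w_out):
            best = NEG_INF
            for di in range(kernel_size):
                for dj in range(kernel_size):
                    r = i * stride + di - padding  # index into the unpadded input
                    c = j * stride + dj - padding
                    if 0 <= r < h and 0 <= c < w:  # the "mask"
                        best = max(best, x[r][c])
            row.append(best)
        out.append(row)
    return out
```

Both functions should agree on any input; the masked one just never allocates the enlarged buffer.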

All tests are included in the code.

Testing:

Please see detailed unit test requirements in the CONTRIBUTING.md

[x] The change is covered by numeric check using nki.baremetal
[x] The change is covered by performance benchmark test using nki.benchmark
[ ] The change is covered by end-to-end integration test

Pull Request Checklist

[x] I have filled in all the required fields in the template
[x] I have tested locally that all the tests pass
[x] By submitting this pull request, I confirm that my contribution is made under the terms of the MIT-0 license.

New NKI kernel!

JonathanHenson · 2025-04-16T16:39:53Z

contributed/MaxPool2D.py

+    sz_cin, sz_hin, sz_win = in_tensor.shape
+    sz_hout = (sz_hin + 2*padding - kernel_size) // stride + 1
+    sz_wout = (sz_win + 2*padding - kernel_size) // stride + 1
+


Let's add assertions here on the expected input shape and parameter values.
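A minimal sketch of what such assertions could look like, written as a standalone helper so it runs without the Neuron SDK. The helper name `check_maxpool2d_args` is hypothetical, and the specific limits are assumptions rather than requirements stated in this PR: the 128-partition cap (which NKI exposes as `nl.tile_size.pmax`) and the "padding at most half the kernel size" rule, which mirrors PyTorch's MaxPool2d constraint.

```python
MAX_PARTITIONS = 128  # assumed SBUF partition limit (nl.tile_size.pmax in NKI)


def check_maxpool2d_args(in_shape, kernel_size, stride, padding):
    """Validate pooling arguments and return (sz_cin, sz_hout, sz_wout)."""
    assert len(in_shape) == 3, f"expected a [C, H, W] input, got shape {in_shape}"
    sz_cin, sz_hin, sz_win = in_shape

    assert 1 <= sz_cin <= MAX_PARTITIONS, (
        f"channel dim {sz_cin} must fit in {MAX_PARTITIONS} partitions"
    )
    assert kernel_size >= 1, "kernel_size must be positive"
    assert stride >= 1, "stride must be positive"
    # Assumption: same constraint PyTorch enforces for MaxPool2d.
    assert 0 <= padding <= kernel_size // 2, (
        "padding must be non-negative and at most half the kernel size"
    )
    assert sz_hin + 2 * padding >= kernel_size, "kernel is taller than padded input"
    assert sz_win + 2 * padding >= kernel_size, "kernel is wider than padded input"

    # Output-size formula taken from the diff above.
    sz_hout = (sz_hin + 2 * padding - kernel_size) // stride + 1
    sz_wout = (sz_win + 2 * padding - kernel_size) // stride + 1
    return sz_cin, sz_hout, sz_wout
```

For example, `check_maxpool2d_args((64, 32, 32), kernel_size=3, stride=2, padding=1)` passes every check, while a 2D shape such as `(32, 32)` fails the first assertion.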

JonathanHenson · 2025-04-16T16:40:16Z

contributed/MaxPool2D.py

+    sz_p = sz_cin
+
+    # Generate pool index patterns with stride
+    i0 = nl.arange(sz_p)[:, None, None, None, None]  # Channel dim


Let's use mgrid for this.
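To show what the suggestion amounts to, here is a NumPy stand-in (not NKI code) comparing the arange-plus-None pattern from the diff with a single mgrid call. It relies on `nl.mgrid` being modeled on `np.mgrid`. The axis order (channel, output row, kernel row, output column, kernel column) and the sizes are assumptions for illustration; only the first and last axes are visible in the diff.

```python
import numpy as np

# Hypothetical sizes, chosen only for illustration.
sz_p, sz_hout, sz_wout, kernel_size, stride = 2, 3, 3, 2, 2

# Pattern from the diff: one arange per axis, broadcast with None indexing.
a0 = np.arange(sz_p)[:, None, None, None, None]         # channel
a1 = np.arange(sz_hout)[None, :, None, None, None]      # output row
a2 = np.arange(kernel_size)[None, None, :, None, None]  # pool height
a3 = np.arange(sz_wout)[None, None, None, :, None]      # output column
a4 = np.arange(kernel_size)[None, None, None, None, :]  # pool width

# Same index grids from a single mgrid call.
i0, i1, i2, i3, i4 = np.mgrid[
    0:sz_p, 0:sz_hout, 0:kernel_size, 0:sz_wout, 0:kernel_size
]

# Input row/column touched by each (output position, kernel offset) pair.
rows_arange = a1 * stride + a2
cols_arange = a3 * stride + a4
rows_mgrid = i1 * stride + i2
cols_mgrid = i3 * stride + i4

full_shape = (sz_p, sz_hout, kernel_size, sz_wout, kernel_size)
same_rows = np.array_equal(np.broadcast_to(rows_arange, full_shape), rows_mgrid)
same_cols = np.array_equal(np.broadcast_to(cols_arange, full_shape), cols_mgrid)
```

The two forms produce the same indices once broadcast; mgrid simply states all five ranges in one place instead of five hand-aligned None patterns.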

JonathanHenson · 2025-04-16T16:42:30Z

contributed/MaxPool2D.py

+    i4 = nl.arange(kernel_size)[None, None, None, None, :]  # Pool width
+
+    # Load input data
+    in_tile: tensor[sz_p, sz_hin, sz_win] = nl.load(in_tensor)


Per the docs (linked below), the partition dimension must be the first dimension. These should be 2D tiles; we're deprecating the block dimension on SBUF.

https://awsdocs-neuron.readthedocs-hosted.com/en/latest/general/nki/api/generated/nki.language.load.html#nki.language.load
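One possible reading of the 2D-tile request, sketched in NumPy rather than NKI: keep the channel (partition) axis first and flatten height and width into a single free axis, then address each element with a flattened index `row * sz_win + col`. Whether this is the layout the reviewer has in mind is an assumption; the sizes are hypothetical and padding is omitted for brevity.

```python
import numpy as np

# Hypothetical sizes, no padding, to keep the sketch short.
sz_p, sz_hin, sz_win = 2, 4, 4
kernel_size = stride = 2
x = np.arange(sz_p * sz_hin * sz_win, dtype=np.float32).reshape(sz_p, sz_hin, sz_win)

# 2D view: partition (channel) axis first, one flattened free axis.
tile_2d = x.reshape(sz_p, sz_hin * sz_win)

sz_hout = (sz_hin - kernel_size) // stride + 1
sz_wout = (sz_win - kernel_size) // stride + 1
i_p, i_ho, i_wo, i_kh, i_kw = np.mgrid[
    0:sz_p, 0:sz_hout, 0:sz_wout, 0:kernel_size, 0:kernel_size
]

rows = i_ho * stride + i_kh
cols = i_wo * stride + i_kw
flat = rows * sz_win + cols  # position along the flattened free axis

# Gather each pooling window from the 2D tile and reduce over the kernel axes.
out_from_2d = tile_2d[i_p, flat].max(axis=(3, 4))

# The same pooling using the original 3D indexing, for comparison.
out_from_3d = x[i_p, rows, cols].max(axis=(3, 4))
```

Both results have shape `(sz_p, sz_hout, sz_wout)` and match element for element, so the pooling logic does not depend on holding the tile in 3D.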

jimburtoft · 2025-05-14

@JonathanHenson The tiling was based on the average_pool2d example:

nki-samples/src/nki_samples/tutorials/average_pool2d/average_pool2d_nki_kernels.py

Line 41 in 3c5d277

in_tile: tensor[sz_p, sz_hin, sz_win] = nl.load(in_tensor)

Will that be updated any time soon?

jimburtoft and others added 2 commits April 9, 2025 14:49

Create MaxPool2D.py

42bfda4

New NKI kernel!

Merge branch 'main' into patch-2

1eeb0af

JonathanHenson reviewed Apr 16, 2025

jimburtoft May 14, 2025
