How to decide a proper shape for the `tmp` buffer in `T.tile.reduce_xxx`?

```python
T.tile.reduce_max(out: Buffer, buffer: Buffer, tmp: Buffer, dim: int)
```
The `T.tile.reduce_xxx` primitives require a `tmp` buffer to be passed in order to function properly.

Currently, the shape allocated for `tmp` varies across the examples, and it is unclear how the appropriate shape should be determined.

For example:
In [`examples/normalization/layer_norm.py`](https://github.com/tile-ai/tilelang-ascend/blob/ascendc_pto/examples/normalization/layer_norm.py#L31)
```python
tmp_ub = T.alloc_ub([3 * DataType(dtype).bits // 8 * block_M // VEC_NUM * block_N], "uint8")
```

In [`examples/softmax/example_online_softmax.py`](https://github.com/tile-ai/tilelang-ascend/blob/ascendc_pto/examples/softmax/example_online_softmax.py#L46)
```python
tmp = T.alloc_ub([2 * sub_block_M * block_N], "uint8")
```

In [`examples/lightning_indexer/example_lightning_indexer.py`](https://github.com/tile-ai/tilelang-ascend/blob/ascendc_pto/examples/lightning_indexer/example_lightning_indexer.py#L75)
```python
mm_res_ub_uint8 = T.alloc_ub((VECTOR_BASEG, VECTOR_BASEN), "uint8")
```
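To make the discrepancy concrete, here is a rough sketch in plain Python that evaluates the three size expressions above and normalizes each to bytes of `tmp` per element of the tile. The helper names and the tile sizes in the usage part are hypothetical and chosen only for illustration; they are not taken from the examples. It also assumes that each `tmp` covers a tile of the shape implied by its own expression, which may not match how the kernels actually use it.

```python
def layer_norm_tmp_bytes(dtype_bits, block_M, block_N, vec_num):
    # Same expression as in layer_norm.py, evaluated left to right.
    return 3 * dtype_bits // 8 * block_M // vec_num * block_N


def online_softmax_tmp_bytes(sub_block_M, block_N):
    # Same expression as in example_online_softmax.py.
    return 2 * sub_block_M * block_N


def lightning_indexer_tmp_bytes(vector_baseg, vector_basen):
    # A (VECTOR_BASEG, VECTOR_BASEN) uint8 buffer occupies one byte per entry.
    return vector_baseg * vector_basen


def bytes_per_element(tmp_bytes, rows, cols):
    # Normalize the tmp size by the number of elements in the tile.
    return tmp_bytes / (rows * cols)


if __name__ == "__main__":
    # Hypothetical tile sizes, for illustration only.
    block_M, block_N, vec_num = 128, 128, 2
    sub_block_M = block_M // vec_num  # assumed relation, not from the source

    ln = layer_norm_tmp_bytes(32, block_M, block_N, vec_num)
    sm = online_softmax_tmp_bytes(sub_block_M, block_N)
    li = lightning_indexer_tmp_bytes(16, 128)

    print("layer_norm       :", bytes_per_element(ln, sub_block_M, block_N))
    print("online_softmax   :", bytes_per_element(sm, sub_block_M, block_N))
    print("lightning_indexer:", bytes_per_element(li, 16, 128))
```

Under these assumptions the three examples reserve `3 * sizeof(dtype)`, 2, and 1 bytes of `tmp` per tile element respectively, which illustrates why a single documented rule would help.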

AscendC provides the [`GetReduceSumMaxMinTmpSize`](https://www.hiascend.com/document/detail/zh/CANNCommunityEdition/83RC1/API/ascendcopapi/atlasascendc_api_07_10160.html) API for estimating the temporary buffer size required by the `AscendC::ReduceXXX` APIs.
However, this API is currently not accessible in Tilelang-Ascend.
