Add lowering for `insert_slice`-like `scatter` ops (KV-cache) #2771

Wheest · 2025-04-08T10:01:49Z

Coming from #1758, this PR adds a lowering for a narrow case of stablehlo.scatter, namely scatters that are equivalent to a tensor.insert_slice.

This is a common case, since it's how KV-cache updates are modelled when exporting from PyTorch.
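To make the equivalence concrete, here is a minimal pure-Python sketch of the semantics only, not the MLIR pattern in this PR. The helper names `scatter_overwrite` and `insert_slice`, the 2-D list representation, and the toy cache shape are all assumptions for illustration. It models a scatter with a single start index and a plain overwrite (no combining computation), and shows it gives the same result as inserting the update as a contiguous slice.

```python
def scatter_overwrite(operand, start, updates):
    """Hypothetical model of the narrow scatter case: one start index,
    and each update element overwrites operand[start + offset]."""
    out = [row[:] for row in operand]  # copy so the input is not mutated
    for i, update_row in enumerate(updates):
        for j, value in enumerate(update_row):
            out[start[0] + i][start[1] + j] = value
    return out


def insert_slice(dest, source, offsets):
    """Hypothetical model of an insert_slice: copy `source` into `dest`
    as one contiguous window beginning at `offsets`."""
    out = [row[:] for row in dest]
    width = len(source[0])
    for i, source_row in enumerate(source):
        out[offsets[0] + i][offsets[1]:offsets[1] + width] = source_row
    return out


# Toy KV-cache (assumed shape): 4 sequence positions, head dimension 2.
cache = [[0, 0] for _ in range(4)]
new_entry = [[7, 8]]   # key (or value) vector for one new token
position = (2, 0)      # write at sequence position 2

via_scatter = scatter_overwrite(cache, position, new_entry)
via_insert = insert_slice(cache, new_entry, position)
```

Both calls write the same contiguous window, which is why a scatter of this shape can be rewritten as a slice insertion. A scatter that combines values (for example by adding) or that writes to several unrelated indices would not fit this model.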

The code is adapted from an implementation in the catalyst compiler by @erick-xanadu, expanded to cover the cases I saw coming out of PyTorch.

The conversion produces tensor.insert_slice ops, rather than linalg. This may or may not be acceptable, but I'm putting the PR up first to get thoughts.

I don't believe there's an exact equivalent to insert_slice in linalg, but I think it could be achieved with a linalg.generic. However, since the StableHLO conversion inserts a bunch of tensor ops anyway, I don't think lowering to tensor.insert_slice directly is against the spirit of what already exists.
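As a rough illustration of the linalg.generic idea, the sketch below expresses slice insertion as a per-element select: every output element is taken from the source when its index falls inside the window, and from the destination otherwise. This is plain Python, not linalg, and the function name `insert_slice_elementwise` and the 2-D list representation are assumptions for illustration only; it is not what this PR emits.

```python
def insert_slice_elementwise(dest, source, offsets):
    """Hypothetical elementwise formulation of slice insertion:
    one independent select per output element."""
    rows, cols = len(source), len(source[0])
    row_off, col_off = offsets
    return [
        [
            # Inside the window: read the shifted source element.
            source[i - row_off][j - col_off]
            if row_off <= i < row_off + rows and col_off <= j < col_off + cols
            # Outside the window: keep the destination element.
            else dest[i][j]
            for j in range(len(dest[0]))
        ]
        for i in range(len(dest))
    ]


result = insert_slice_elementwise([[0, 0, 0] for _ in range(3)], [[1, 2]], (1, 1))
```

The trade-off this suggests is that the elementwise form visits every element of the destination, whereas a slice insertion only describes the window being written.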

All other cases of scatter will be left as-is. Given that scatter is quite a complex op, this pattern will provide near-term utility while a more general-purpose lowering is cooked up.

Note to reviewers: this is my first contribution to the project, so there may be some workflow things I've missed.

feat: add scatter lowering for insert_slice case

9c6e444

Wheest marked this pull request as ready for review April 8, 2025 10:03

Wheest changed the title ~~Add lowering for insert_slice-like scatter ops (KV-cache) #310~~ Add lowering for insert_slice-like scatter ops (KV-cache) Apr 8, 2025
