WIP: Add LBR_GRU to cell_fusion kernel for MTL dispatch #2868

h-sadia · 2025-03-12T19:50:34Z

Description

This WIP PR has a few things left:

Figure out correctness issues for LBR GRU
Consolidate common code between cell_common and cell_gru_lbr and put it in a common header file
Consolidate the LBR GRU code in the kernel into a single function to remove duplicate code
Ideally a new lcm function can be created for k_limit
Fixes # (MFDNN-12712)

rjoursler · 2025-03-12T20:02:57Z

src/gpu/intel/ocl/rnn/rnn_utils.cpp

+        int eu_count = device_info.eu_count();
+        int ideal_k_block = graph::utils::lcm(
+                eu_count, (int)device_info.min_subgroup_size());
+        int ideal_k_limit = graph::utils::lcm(ideal_k_block, (int)rnn.sic);


Use math_utils::lcm here instead of graph::utils::lcm.
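
For reference, the arithmetic in the hunk above can be sketched in standalone C++. This is only an illustration: `ideal_k_limit` is a hypothetical helper name, and `std::lcm` (C++17) stands in for whichever library lcm utility the PR ends up using.

```cpp
#include <cassert>
#include <cstdint>
#include <numeric>

// Hypothetical helper, not part of oneDNN: it mirrors the two lcm steps from
// the diff, with std::lcm standing in for the library's own lcm utility.
inline int64_t ideal_k_limit(
        int64_t eu_count, int64_t min_subgroup_size, int64_t sic) {
    // lcm is only meaningful here for positive inputs.
    assert(eu_count > 0 && min_subgroup_size > 0 && sic > 0);
    // Smallest block size divisible by both the EU count and the subgroup size.
    const int64_t ideal_k_block = std::lcm(eu_count, min_subgroup_size);
    // Smallest limit divisible by both that block size and sic.
    return std::lcm(ideal_k_block, sic);
}
```

With illustrative values of 128 EUs, a minimum subgroup size of 8, and sic = 96, the block size is 128 and the limit is 384.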

rjoursler · 2025-03-12T20:05:01Z

src/gpu/intel/ocl/rnn/rnn_grid.cl

-                        ctx.rnn.bias[off_ker_bias(dims.dhc, 0, c)],
-                        ctx.rnn.alpha, ctx.rnn.tm_scales);
-                store_vanilla_rnn(gates.ptr, gates.strides.mb, states.ptr,
-                        states.strides.mb, dims.dhc, n, c, g);


Please keep the runtime if statements so that we get compiler tests on all cases during development.
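
A minimal C++ sketch of the pattern being asked for, assuming the cell kind is a build-time constant. The macro names and the `cell_update` function are hypothetical, and the branch bodies are simplified placeholders rather than the kernel's code.

```cpp
#include <cassert>

// Hypothetical build-time configuration, for illustration only.
#define CELL_KIND_VANILLA_RNN 0
#define CELL_KIND_LBR_GRU 1
#ifndef CELL_KIND
#define CELL_KIND CELL_KIND_LBR_GRU
#endif

inline float cell_update(float h_prev, float g0, float g2) {
    // An ordinary `if` on a constant keeps every branch visible to the
    // compiler, so each one is parsed and type-checked in every build.
    // With `#if`, the inactive branch would never be compiled at all.
    if (CELL_KIND == CELL_KIND_VANILLA_RNN) {
        // Simplified: the single activated gate becomes the new state.
        return g0;
    } else if (CELL_KIND == CELL_KIND_LBR_GRU) {
        // Simplified: blend the previous state with the candidate state.
        return g0 * h_prev + (1.0f - g0) * g2;
    }
    assert(!"unsupported cell kind");
    return 0.0f;
}
```

The optimizer is still expected to drop the dead branch, so the runtime form should cost nothing in the generated code while keeping all cases under compiler coverage.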

rjoursler · 2025-03-12T20:06:42Z

src/gpu/intel/ocl/rnn/rnn_grid.cl

+                            * TO_REF(
+                                    ctx.lbr_gru.hidden_state_iter[cell_ws_state(
+                                            states.strides.mb, n, c)])
+                    + (1 - G0) * G2;


Please wrap the Ht calculation into compute_gates_lbr_gru; we can then get rid of all these intermediate variables.

Additionally, please update elemwise_fwd to use compute_gates_lbr_gru to avoid code duplication.
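
As a rough sketch of what such a helper would encapsulate, the following standalone C++ computes the new hidden state from per-gate pre-activations. It assumes the standard linear-before-reset GRU formulation with sigmoid and tanh activations, and that G0 is the update gate as the `(1 - G0) * G2` term in the hunk suggests. The struct, its field names, and `lbr_gru_new_state` are hypothetical and do not reflect the kernel's actual signature.

```cpp
#include <cmath>

// Hypothetical inputs for one (minibatch, channel) element.
struct lbr_gru_inputs_t {
    float wx_u, wx_r, wx_o; // layer (input) GEMM results per gate
    float uh_u, uh_r, uh_o; // iteration (hidden) GEMM results per gate
    float b_u, b_r, b_o; // per-gate biases
    float b_lbr; // extra bias applied before the reset gate is multiplied in
    float h_prev; // hidden state from the previous iteration
};

inline float sigmoid(float x) {
    return 1.0f / (1.0f + std::exp(-x));
}

// Computes Ht in one place so callers need no intermediate gate variables.
inline float lbr_gru_new_state(const lbr_gru_inputs_t &in) {
    const float g0 = sigmoid(in.wx_u + in.uh_u + in.b_u); // update gate
    const float g1 = sigmoid(in.wx_r + in.uh_r + in.b_r); // reset gate
    // Linear-before-reset: the reset gate scales the already-computed
    // hidden-state projection (plus its bias) instead of h_prev itself.
    const float g2 = std::tanh(in.wx_o + g1 * (in.uh_o + in.b_lbr) + in.b_o);
    return g0 * in.h_prev + (1.0f - g0) * g2;
}
```

Keeping the whole update in one function is what would let both the fused cell kernel and elemwise_fwd share it.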

rjoursler · 2025-03-12T20:12:39Z

src/gpu/intel/ocl/rnn/cell_gru_lbr.cpp

+}
+
+status_t compute_cell_fwd(const exec_ctx_t &ctx,
+        const compute::kernel_t &kernel, dim_t lay, dim_t dir, dim_t iter,


Rather than duplicating this function, can we reasonably create a shared implementation between cell_common and this one?

h-sadia added 3 commits March 12, 2025 12:38

intel: ocl: rnn: initial k dispatch logic

9e58cfb

intel: ocl: rnn: add lbr_gru algorithm to cell_common

c94ca30

intel: ocl: rnn: add use_cell function call to cell_gru_lbr

cb584d4

h-sadia requested a review from a team as a code owner March 12, 2025 19:50

github-actions bot added the platform:gpu-intel Codeowner: @oneapi-src/onednn-gpu-intel label Mar 12, 2025

rjoursler reviewed Mar 12, 2025
