[STF] Implement a reduce algorithm over CUB #3122
…s a low level approach so that we can improve the existing one
/ok to test
/ok to test
@@ -602,7 +602,7 @@ public:
 }
 }

-private:
+// private:
😲
Indeed, this is a temporary hack; we need to get rid of it anyway to make the implementation compatible with CUDA graphs.
// This will be the ultimate result of the transform followed by reduce
auto result = ctx.logical_data(shape_of<scalar_view<OutT>>());
// Create a typed task eagerly that we'll use throughout
auto t = ctx.task(result.write(), args.read()...);
We need to move the creation of this task into the innermost lambda.
/ok to test
Description
This explores how we can implement C++ algorithms over CUDASTF by means of CUB, starting with the reduce algorithm.
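For reference, the semantics of a "transform followed by reduce" can be sketched on the host with the C++17 standard library. This is only an illustration of the expected result, not the CUDASTF or CUB implementation from this PR; the name `transform_then_reduce` and its parameters are hypothetical.

```cpp
#include <numeric>
#include <vector>

// Host-only reference for "transform followed by reduce".
// NOTE: transform_then_reduce is a hypothetical helper used for illustration;
// it is not part of CUDASTF or CUB.
template <typename OutT, typename InT, typename TransformOp, typename ReduceOp>
OutT transform_then_reduce(const std::vector<InT>& in, OutT init, TransformOp transform, ReduceOp reduce)
{
  // std::transform_reduce applies `transform` to every element, then combines
  // the transformed values with `reduce`, starting from `init`.
  return std::transform_reduce(in.begin(), in.end(), init, reduce, transform);
}
```

For example, with the input `{1, 2, 3}`, a squaring transform, addition as the reduction, and an initial value of 0, this returns 14. A device implementation would be expected to produce the same scalar, provided the reduction operator is associative and commutative.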
Checklist