Provide generic and safe C++ interfaces for warp shuffle: Issue #2976 #3210
base: main
Conversation
thanks for the contribution, @soumikiith. I have a couple of initial comments.
I updated #2976 to better formalize the features and checks of these functions.
One question: when computing the lane ID, can I use the modulo operator, or is it preferable to read it directly with an asm instruction? Note that my question only concerns shfl_up and shfl_down. Also, why does a mask value need to be passed to shfl_xor (I know that a default value is assigned)? Isn't passing lanemask sufficient?
You can use the C++ API for PTX; see https://nvidia.github.io/cccl/libcudacxx/ptx/instructions/special_registers.html#laneid
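For reference, here is a host-side sketch of the arithmetic behind the modulo approach. The `Dim3` struct and the helpers `linear_thread_id` and `lane_id` are hypothetical names used only for illustration; they are not part of this PR or of CCCL. The sketch assumes a warp size of 32 and the usual x-fastest thread ordering within a block. It shows that `threadIdx.x % 32` only matches the lane in 1D blocks, which is one reason reading the `%laneid` special register in device code is the more robust choice.

```cpp
#include <cstdint>

// Hypothetical stand-in for CUDA's dim3 / threadIdx so the sketch runs on the host.
struct Dim3 { std::uint32_t x, y, z; };

// Assumption: warp size of 32, as on current NVIDIA GPUs.
constexpr std::uint32_t kWarpSize = 32;

// Linear thread index within a block: x varies fastest, then y, then z.
constexpr std::uint32_t linear_thread_id(Dim3 tid, Dim3 block) {
    return tid.x + tid.y * block.x + tid.z * block.x * block.y;
}

// Lane = position of the thread inside its warp.
constexpr std::uint32_t lane_id(Dim3 tid, Dim3 block) {
    return linear_thread_id(tid, block) % kWarpSize;
}

// 1D block: the lane equals threadIdx.x % 32.
static_assert(lane_id({37, 0, 0}, {128, 1, 1}) == 5, "1D case");
// 16x16 block: thread (3, 1) is linear thread 19, so its lane is 19, not 3.
static_assert(lane_id({3, 1, 0}, {16, 16, 1}) == 19, "2D case");
```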
Referring to the official documentation,
Hi, I have added the checks (I still need to fix the assertion statements, though). Please review them and let me know whether they meet your requirements. I will soon commit the casting of different data types using Please let me know of any additional requirements.
Hi, Merry Christmas!
Description
closes #2976
I have provided a generic and safe C++ interface for warp shuffle (shuffle_sync only for now). The safety features include: (1) checking for allowable data types, and (2) handling of variables that consist of 4 bytes (32 bits).
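As an illustration of the two checks above, here is a minimal host-side sketch. The names `to_shuffle_word` and `from_shuffle_word` are hypothetical and are not taken from this PR; the actual interface may differ. The idea is to reject unsuitable types at compile time and to bit-copy a 4-byte value into the 32-bit word that a shuffle intrinsic would move between lanes.

```cpp
#include <cstdint>
#include <cstring>
#include <type_traits>

// Hypothetical helper: turn a 4-byte value into a 32-bit shuffle word.
template <typename T>
std::uint32_t to_shuffle_word(const T& value) {
    // (1) Allowable types: the value must be copyable byte by byte.
    static_assert(std::is_trivially_copyable<T>::value,
                  "shuffle requires a trivially copyable type");
    // (2) This sketch only handles 4-byte (32-bit) values.
    static_assert(sizeof(T) == sizeof(std::uint32_t),
                  "only 32-bit types are handled here");
    std::uint32_t word;
    std::memcpy(&word, &value, sizeof(word));  // well-defined bit copy
    return word;
}

// Hypothetical inverse: rebuild the typed value from the received word.
// Assumes T is default constructible.
template <typename T>
T from_shuffle_word(std::uint32_t word) {
    static_assert(std::is_trivially_copyable<T>::value,
                  "shuffle requires a trivially copyable type");
    static_assert(sizeof(T) == sizeof(std::uint32_t),
                  "only 32-bit types are handled here");
    T value;
    std::memcpy(&value, &word, sizeof(value));
    return value;
}
```

In device code, the word returned by `to_shuffle_word` would be the argument handed to `__shfl_sync`, and the result would go through `from_shuffle_word<T>`.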
I will soon post support for 16-bit and 64-bit data types.
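One common way to support 64-bit values is to split them into two 32-bit words, shuffle each word separately, and recombine them on the receiving lane. Whether this PR will take that route is not stated here, so the following host-side sketch is only an assumption; `WordPair`, `split_words`, and `join_words` are hypothetical names.

```cpp
#include <cstdint>
#include <cstring>
#include <type_traits>

// Hypothetical pair of 32-bit words, each of which would be shuffled on its own.
struct WordPair { std::uint32_t lo, hi; };

// Split an 8-byte value into its low and high 32-bit halves.
template <typename T>
WordPair split_words(const T& value) {
    static_assert(std::is_trivially_copyable<T>::value,
                  "shuffle requires a trivially copyable type");
    static_assert(sizeof(T) == sizeof(std::uint64_t),
                  "this sketch handles 64-bit types only");
    std::uint64_t bits;
    std::memcpy(&bits, &value, sizeof(bits));
    return { static_cast<std::uint32_t>(bits),
             static_cast<std::uint32_t>(bits >> 32) };
}

// Recombine the two halves into the original type.
// Assumes T is default constructible.
template <typename T>
T join_words(WordPair p) {
    static_assert(std::is_trivially_copyable<T>::value,
                  "shuffle requires a trivially copyable type");
    static_assert(sizeof(T) == sizeof(std::uint64_t),
                  "this sketch handles 64-bit types only");
    const std::uint64_t bits = (static_cast<std::uint64_t>(p.hi) << 32) | p.lo;
    T value;
    std::memcpy(&value, &bits, sizeof(value));
    return value;
}
```

A 16-bit value could be handled in the opposite direction, by widening it to a 32-bit word before the shuffle and narrowing it afterwards.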
Checklist