[SYCL] Add fixed_size_group support to algorithms #9181
Enables the following functions to be used with fixed_size_group arguments:
- group_barrier
- group_broadcast
- any_of_group
- all_of_group
- none_of_group
- reduce_over_group
- exclusive_scan_over_group
- inclusive_scan_over_group

Signed-off-by: John Pennycook [email protected]
Co-authored-by: aelovikov-intel <[email protected]>
Co-authored-by: Steffen Larsen <[email protected]>
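For readers unfamiliar with the semantics, here is a minimal host-side sketch of what a reduction and an exclusive scan compute when a sub-group is split into fixed-size partitions. This is plain C++ rather than SYCL, and it is not the implementation from this PR: the helper names `model_reduce_over_partitions` and `model_exclusive_scan_over_partitions` are hypothetical, the operation is fixed to integer addition, and the input length is assumed to be a multiple of the partition size.

```cpp
#include <cstddef>
#include <vector>

// Host-side model only (hypothetical helpers, not DPC++ code).
// values[i] stands for the value contributed by work-item i of one sub-group.
// Assumption: values.size() is a multiple of PartitionSize. Trailing elements
// that do not fill a whole partition are left as 0.

// Models a plus-reduction over each fixed-size partition: every work-item in
// a partition receives the sum of that partition's values.
template <std::size_t PartitionSize>
std::vector<int> model_reduce_over_partitions(const std::vector<int>& values) {
  static_assert(PartitionSize > 0, "partition size must be positive");
  std::vector<int> result(values.size(), 0);
  for (std::size_t base = 0; base + PartitionSize <= values.size();
       base += PartitionSize) {
    int sum = 0;
    for (std::size_t i = 0; i < PartitionSize; ++i) {
      sum += values[base + i];
    }
    for (std::size_t i = 0; i < PartitionSize; ++i) {
      result[base + i] = sum;
    }
  }
  return result;
}

// Models an exclusive plus-scan over each fixed-size partition: work-item i
// receives the sum of the values held by earlier work-items in its partition,
// and the running total restarts at 0 for each new partition.
template <std::size_t PartitionSize>
std::vector<int> model_exclusive_scan_over_partitions(
    const std::vector<int>& values) {
  static_assert(PartitionSize > 0, "partition size must be positive");
  std::vector<int> result(values.size(), 0);
  for (std::size_t base = 0; base + PartitionSize <= values.size();
       base += PartitionSize) {
    int running = 0;
    for (std::size_t i = 0; i < PartitionSize; ++i) {
      result[base + i] = running;
      running += values[base + i];
    }
  }
  return result;
}
```

With the values 1 through 8 and a partition size of 4, the reduction model yields 10 for the first four work-items and 26 for the last four, and the exclusive scan model yields 0, 1, 3, 6 followed by 0, 5, 11, 18. The key point is that no value crosses a partition boundary.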
@steffenlarsen: Do you think Regression/same_unnamed_kernels.cpp might be an unrelated failure?
LGTM!
CUDA Failed Tests (1):
…a. (#9671)

This PR adds CUDA support for the fixed_size_group, ballot_group, and opportunistic_group algorithms. All group algorithm support added for the SPIR-V implementations (e.g. in #9181) is correspondingly added here for the CUDA backend.

Everything except the reduce and scan algorithms uses the same implementation for all non-uniform groups. On sm80, the reduce algorithms also use the same implementation for all group types for the special IsRedux type/op pairs. Otherwise, the reduce and scan algorithms have two implementation categories:
1. fixed_size_group
2. opportunistic_group and ballot_group (and tangle_group once it is supported), which all share the same implementations.

Note that tangle_group is still not supported. However, I think all algorithms implemented for ballot_group/opportunistic_group will be appropriate for tangle_group when it is supported.

---------

Signed-off-by: JackAKirk <[email protected]>