[SYCL][NATIVECPU] Add device library and initial subgroup support #13979

Merged on Jul 9, 2024 (151 commits)

@uwedolinsky uwedolinsky commented May 31, 2024

This PR implements the SYCL NativeCPU runtime functions as C++ functions in a new native_cpu device library instead of materializing them in LLVM passes. The library also contains native_cpu implementations of many SYCL builtins, including those for subgroup support. With this PR, at least the following e2e tests pass:

SubGroup/barrier.cpp
SubGroup/broadcast.cpp
SubGroup/broadcast_fp64.cpp
SubGroup/common.cpp
SubGroup/generic-shuffle.cpp
SubGroup/shuffle_fp64.cpp
SubGroup/sub_group_as.cpp
SubGroup/sub_group_as_vec.cpp
SubGroup/sub_group_by_value_semantics.cpp
SubGroup/sub_groups_sycl2020.cpp

Other tests are currently skipped because the NativeCPU UR adapter does not yet report the new capabilities; the adapter will be updated in a subsequent PR.

@uwedolinsky uwedolinsky requested a review from a team as a code owner May 31, 2024 09:04
@uwedolinsky uwedolinsky requested a review from dm-vodopyanov May 31, 2024 09:04
@uwedolinsky uwedolinsky marked this pull request as draft May 31, 2024 09:05
@uwedolinsky uwedolinsky marked this pull request as ready for review June 4, 2024 08:39
((TC->getAuxTriple()->isOSWindows()
? "remangled-l32-signed_char.libspirv-"
: "remangled-l64-signed_char.libspirv-") +
TC->getTripleString() + ".bc")
Contributor

I believe the target triple for NVPTX is expected to be nvptx64-nvidia-cuda, so we should be able to use TC->getTripleString() for all cases. If that's not the case, maybe use a variable to hold the triple string instead of this cascading value assignment, for readability.

Contributor Author

Using TC->getTripleString() for all cases would change the current behavior because Triple::isNVPTX() also returns true for the target triple nvptx-nvidia-cuda. I'm not sure if this triple is meant to be supported, but it's not rejected. Both nvptx and nvptx64 currently select the same libclc (for nvptx64), which may be a separate issue. But to keep this behavior I've created a variable to hold the target triple as you suggested. Is that change acceptable?
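
For readers following the thread, the "variable to hold the target triple" idea can be sketched roughly as below. This is a standalone illustration, not the code from the PR: `libspirvFileName`, its parameters, and the prefix check standing in for Triple::isNVPTX() are all hypothetical, and the sketch assumes that every NVPTX triple maps to the nvptx64-nvidia-cuda library, as the comments above describe.

```cpp
#include <string>

// Hypothetical helper, written without any clang/LLVM classes.
// targetTriple: the device target triple string (e.g. "nvptx-nvidia-cuda").
// hostIsWindows: stands in for the aux-triple Windows check in the snippet above.
std::string libspirvFileName(const std::string &targetTriple,
                             bool hostIsWindows) {
  // Assumption: a triple starting with "nvptx" counts as NVPTX, which covers
  // both nvptx-nvidia-cuda and nvptx64-nvidia-cuda.
  const bool isNVPTX = targetTriple.rfind("nvptx", 0) == 0;

  // Hold the triple used for the library name in one variable, so that both
  // NVPTX variants keep selecting the nvptx64 library.
  const std::string libTriple =
      isNVPTX ? std::string("nvptx64-nvidia-cuda") : targetTriple;

  // The file name prefix depends only on the host OS check.
  const std::string prefix = hostIsWindows
                                 ? "remangled-l32-signed_char.libspirv-"
                                 : "remangled-l64-signed_char.libspirv-";

  return prefix + libTriple + ".bc";
}
```

Separating the triple choice from the prefix choice means each decision is made once and the final concatenation stays a single expression, which is the readability point raised in the review.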

Contributor

@mdtoguchi mdtoguchi left a comment

Ok for driver - thanks!

Contributor

@steffenlarsen steffenlarsen left a comment

Tests LGTM!

@@ -1,5 +1,8 @@
// TODO: move this to clang/Driver once Native CPU is enabled in CI
// REQUIRES: native_cpu
// TODO: currently disabled on Windows, but could work on Windows if Linux
// libraries are found
// UNSUPPORTED: (windows)
Contributor

For good measure, could you please open a GitHub issue and add a link in a comment here?

Contributor Author

done! #14500

Contributor

@PietroGhg PietroGhg left a comment

LGTM, thank you

@sommerlukas sommerlukas merged commit 17ee3e2 into intel:sycl Jul 9, 2024
14 checks passed
sommerlukas pushed a commit that referenced this pull request Jul 10, 2024
This fixes the failure in the nightly
(https://github.com/intel/llvm/actions/runs/9867495007/job/27247963704)
introduced by #13979 by adding a mock
file for the Native CPU utils lib, and using the `--sysroot` option so
that the driver picks it up, similarly to how it's done for e.g.
[sycl-device-lib.cpp](https://github.com/intel/llvm/blob/sycl/clang/test/Driver/sycl-device-lib.cpp).