-
Notifications
You must be signed in to change notification settings - Fork 769
[SYCL][NATIVECPU] Add device library and initial subgroup support #13979
New issue
Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.
By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.
Already on GitHub? Sign in to your account
Conversation
clang/lib/Driver/Driver.cpp
Outdated
((TC->getAuxTriple()->isOSWindows() | ||
? "remangled-l32-signed_char.libspirv-" | ||
: "remangled-l64-signed_char.libspirv-") + | ||
TC->getTripleString() + ".bc") |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
I believe the target triple for NVPTX is expected to be nvptx64-nvidia-cuda
, so we should be able to use the TC->getTripleString()
for all cases. If that's not the case, maybe use a variable to hold the triple string to use instead of this cascading value assignment for readibility.
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Using TC->getTripleString()
for all cases would change the current behavior because Triple::isNVPTX()
also returns true for target triple nvptx-nvidia-cuda
. I'm not sure if this triple is meant to be supported but it's not rejected. Both, nvptx
and nvptx64
currently select the same libclc (for nvptx64), which may be a separate issue. But to keep this behavior I've created a variable to hold the target triple as you suggested. Is that change acceptable?
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Ok for driver - thanks!
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Tests LGTM!
@@ -1,5 +1,8 @@ | |||
// TODO: move this to clang/Driver once Native CPU is enabled in CI | |||
// REQUIRES: native_cpu | |||
// TODO: currently disabled on Windows, but could work on Windows if Linux | |||
// libraries are found | |||
// UNSUPPORTED: (windows) |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
For good measure, could you please open a Github issue and add a link in a comment here?
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
done! #14500
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
LGTM, thank you
This fixes the failure in the nightly (https://github.com/intel/llvm/actions/runs/9867495007/job/27247963704) introduced by #13979 by adding a mock file for the Native CPU utils lib, and using the `--sysroot` option so that the driver picks it up, similarly to how it's done for e.g. [sycl-device-lib.cpp](https://github.com/intel/llvm/blob/sycl/clang/test/Driver/sycl-device-lib.cpp).
This PR implements SYCL NativeCPU runtime functions as C++ functions in a new native_cpu device library instead of materializing them by LLVM passes. This library also contains native_cpu implementations for many SYCL builtins, including for subgroup support. The PR will make at least the following e2e tests pass:
Other tests are currently skipped as the NativeCPU UR adapter does not yet report the new capabilities, which will be updated in a subsequent PR.