[SYCL] Add esimd device descriptor for 2d load/store/prefetch #15905
Am I right in thinking that this feature (2D block loads/stores) is not actually exposed to SYCL?
It is indirectly available to users through the joint_matrix abstraction, but for that I don't think the user needs to be able to query it. Currently it isn't available directly in a supported way (cutlass uses IGC builtins). We plan to support it at least through SPIR-V/OpenCL interop and, if there is demand, also as a SYCL extension.
It does seem weird to add a SYCL query for a feature that cannot be used via SYCL. Can we instead add the query at the same level as the API? For example, if applications will access 2D block load/store via OpenCL, can we add an OpenCL query?
Thanks for the comments. The request to add this query at the SYCL level came from the user. I will get more details about how they use the 2D block load/store feature and whether they can consider using the L0 API directly.
According to the info from the user, they use either ESIMD or joint_matrix. In the case of ESIMD, it looks like the load_2d/store_2d operations are mapped to this low-level feature: https://github.com/intel/compute-runtime/blob/24.39.31294.12/level_zero/doc/experimental_extensions/2D_BLOCK_TRANSPOSE.md. Currently the ESIMD docs state that load_2d/store_2d can be used only on PVC (see https://github.com/intel/llvm/pull/15905/files#diff-fef71b0090d0ea44c96240cd9d6497881b617b91e234635a19313eb69864b7f3L555). Does it make sense to introduce such an aspect? I've updated the PR with the proposed changes.
I can't say for sure if the architecture restriction is correct (idk if 2d ops are PVC only, or actually PVC or later or what).
The dyn_cgf_different_arg_nums.cpp failure on CUDA is unrelated and is fixed here: #16157
@intel/unified-runtime-reviewers Could you please approve?
@againull, FYI, this PR breaks some tests on PVC:
Thanks. Those tests have requires: pvc, so I didn't get failures in pre/post-commit. I know the root cause and will fix it.
Add esimd device descriptor to check if 2d block operations are supported by the device.
UR counterpart: oneapi-src/unified-runtime#2261
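
A minimal sketch of how an application might consume a descriptor like this: query the capability once and pick a code path from the result, rather than matching on an architecture name such as PVC. `MockDevice`, `has_2d_block_io`, and `pick_copy_path` are hypothetical names used only for illustration; they are not the API added by this PR, and no SYCL headers are involved.

```cpp
#include <string>

// Hypothetical stand-in for a device; NOT a SYCL type.
struct MockDevice {
  std::string name;
  // Assumed capability flag, mirroring the idea of the new descriptor.
  bool has_2d_block_io;
};

// The two code paths an application might choose between.
enum class CopyPath { Block2D, Scalar };

// Select the path from the capability flag instead of the device name,
// so newer devices with the feature are picked up automatically.
CopyPath pick_copy_path(const MockDevice &dev) {
  return dev.has_2d_block_io ? CopyPath::Block2D : CopyPath::Scalar;
}
```

Keying the decision on a capability rather than an architecture name also sidesteps the open question in the thread about whether 2D operations are PVC-only or available on later hardware too.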