[SYCL][Fusion] API for kernel fusion library #7465

Merged: 2 commits merged into intel:sycl on Dec 5, 2022

Conversation

sommerlukas
Contributor

This is the second patch in a series of patches to add an implementation of the kernel fusion extension. We have split the implementation into multiple patches to make them easier to review.

This patch can be reviewed and merged independently of #7416.

This patch adds the first components of the JIT compiler used to implement kernel fusion at runtime, concretely:

  • API definitions
  • Input translation from SPIR-V to LLVM IR
  • Insertion of fused kernel function stub and metadata
  • Supporting infrastructure such as compiler options.

CMake logic to link and code to invoke this JIT from the SYCL runtime will follow in a later patch.

Co-authored-by: Lukas Sommer [email protected]
Co-authored-by: Victor Perez [email protected]
Signed-off-by: Lukas Sommer [email protected]

@sommerlukas sommerlukas force-pushed the kernel-fusion/second-patch branch from be9e0fa to 0ff42c4 Compare November 21, 2022 13:04
@sommerlukas sommerlukas requested a review from bader November 21, 2022 13:04
@sommerlukas
Contributor Author

The design document for the overall implementation of kernel fusion can be found here.

@sommerlukas sommerlukas force-pushed the kernel-fusion/second-patch branch 2 times, most recently from deebaf4 to 523c851 Compare November 25, 2022 09:20
@sommerlukas sommerlukas self-assigned this Nov 28, 2022
@Naghasan
Contributor

@bader @pvchupin Can anyone help with the review here?

@pvchupin
Contributor

Since this is a new library we may need to determine who would be the best reviewers going forward for these. I'm going to add a few people.
@sommerlukas, can you please also update CODEOWNERS with this change (adding Codeplay folks)?

Contributor

@steffenlarsen steffenlarsen left a comment

Just a couple initial comments. I think I'll need a good cup of coffee for this one, so I will return in a couple of days.

///
/// Enumerate possible kinds of parameters.
/// 1:1 correspondence with the definition in kernel_desc.hpp in the DPC++ SYCL
/// runtime.
Contributor

Does this need to stay in sync with the definition in kernel_desc?

Contributor Author

For now, only Accessor (for internalization) and StdLayout (for constant propagation) receive special treatment by the JIT compilation process. The interface logic to this library in the SYCL runtime (part of a later PR) uses the SYCL RT internal ArgDesc directly, so additional members for this enum in kernel_desc would not cause errors in the JIT.
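
To make the point about additional enum members concrete, here is a minimal sketch. The names `ParamKind`, `Treatment`, and `classify` are hypothetical, and the enumerator list is illustrative rather than the actual definition in the library or in kernel_desc.hpp. Only the special handling of accessors and standard-layout arguments is taken from the reply above.

```cpp
#include <cassert>
#include <cstdint>

// Hypothetical mirror of the runtime's parameter kinds. The enumerators and
// their values are illustrative only.
enum class ParamKind : std::uint32_t {
  Accessor = 0,
  StdLayout = 1,
  Sampler = 2,
  Pointer = 3,
};

// Hypothetical description of how the JIT would treat an argument.
enum class Treatment { Internalize, PropagateConstant, PassThrough };

// Only the two kinds named in the discussion are handled specially. Every
// other value, including one added to the runtime enum later, reaches the
// default case and is passed through unchanged.
constexpr Treatment classify(ParamKind Kind) {
  switch (Kind) {
  case ParamKind::Accessor:
    return Treatment::Internalize;
  case ParamKind::StdLayout:
    return Treatment::PropagateConstant;
  default:
    return Treatment::PassThrough;
  }
}
```

With this shape, a value the JIT has never seen, for example `static_cast<ParamKind>(42)`, is classified as `PassThrough` instead of causing an error.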

Contributor

@AlexeySachkov AlexeySachkov left a comment

Just a brief review; I haven't been able to grasp it in detail yet.

if (KernelFunction->hasMetadata(REQD_WORK_GROUP_SIZE_ATTR)) {
  auto *MD = KernelFunction->getMetadata(REQD_WORK_GROUP_SIZE_ATTR);
  SYCLKernelAttribute ReqdAttr{REQD_WORK_GROUP_SIZE_ATTR};
  getAttributeValues(ReqdAttr.Values, MD, 3);
Contributor

Please note that there is an ongoing effort to allow fewer than 3 operands in reqd_work_group_size metadata, see #7450 (@steffenlarsen can link PRs to other components).

Contributor Author

Thanks for pointing this out, I wasn't aware of that. We should be able to adapt our processing of the attributes once that PR is merged, to match the new behavior.

Contributor

I think for this particular case KhronosGroup/SPIRV-LLVM-Translator#1726 is also relevant. Note that in the reverse-translator case you will always get 3 operands back, but I believe you also care about generated LLVM IR that hasn't gone through the translator, in which case you may need to adjust for optional parameters.

Contributor Author

Thanks for the hint!

Right now, we only care about consuming SPIR-V. Untranslated LLVM IR will become relevant for targets that the DPC++ frontend currently does not compile to SPIR-V, e.g., targeting CUDA or AMD GPUs. Will keep this in mind for when we work on that.
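
One possible way to adapt the attribute processing once fewer than three operands are allowed is sketched below. It is written without LLVM dependencies so that it stands alone. The helper name `padWorkGroupSize` is hypothetical, and padding missing dimensions with 1 is an assumption about the intended semantics, not something this PR implements.

```cpp
#include <array>
#include <cassert>
#include <cstddef>
#include <vector>

// Normalize a reqd_work_group_size operand list to exactly three dimensions.
// Assumption: dimensions that are not specified are treated as size 1.
inline std::array<std::size_t, 3>
padWorkGroupSize(const std::vector<std::size_t> &Operands) {
  assert(!Operands.empty() && Operands.size() <= 3 &&
         "expected between one and three operands");
  std::array<std::size_t, 3> Padded{1, 1, 1};
  // Copy the operands that are present; the remaining entries stay at 1.
  for (std::size_t I = 0; I < Operands.size(); ++I)
    Padded[I] = Operands[I];
  return Padded;
}
```

Input that already has three operands, as produced by the reverse translator according to the comment above, passes through unchanged, so the same code path could serve both cases.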

First components for the JIT compiler used for SYCL kernel fusion:
* API definitions
* Input translation from SPIR-V to LLVM IR
* Insertion of fused kernel function stub and metadata
* Supporting infrastructure such as compiler options.

Co-authored-by: Lukas Sommer <[email protected]>
Co-authored-by: Victor Perez <[email protected]>
Signed-off-by: Lukas Sommer <[email protected]>
@sommerlukas sommerlukas force-pushed the kernel-fusion/second-patch branch from 523c851 to 962c922 Compare November 30, 2022 14:10
@sommerlukas
Contributor Author

Thank you for the first reviews @steffenlarsen and @AlexeySachkov!

Since this is a new library we may need to determine who would be the best reviewers going forward for these. I'm going to add a few people. @sommerlukas, can you please also update CODEOWNERS with this change (adding Codeplay folks)?

@pvchupin: I've added the Codeplay team that has been working on this implementation/library to CODEOWNERS. Let me know if I should add more people/teams.

Contributor

@steffenlarsen steffenlarsen left a comment

Just a couple more comments, but I think it looks good overall.

Note that the comment about the use of auto should probably also apply to a few other files. It doesn't have to cover all uses of auto; for example, X::create should be clear enough about the type being X. But where the type isn't obvious from the initializer, it would be nice to spell it out explicitly.
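
As an illustration of that distinction, here is a small sketch. `KernelRecord`, `Registry`, and `numArgsOf` are hypothetical names made up for this example and do not appear in the patch.

```cpp
#include <cassert>
#include <memory>
#include <string>
#include <utility>
#include <vector>

struct KernelRecord {
  std::string Name;
  unsigned NumArgs;
};

class Registry {
public:
  static std::unique_ptr<Registry> create() {
    return std::make_unique<Registry>();
  }

  void add(KernelRecord R) { Records.push_back(std::move(R)); }

  // Returns nullptr when no record has the given name.
  const KernelRecord *find(const std::string &Name) const {
    for (const KernelRecord &R : Records)
      if (R.Name == Name)
        return &R;
    return nullptr;
  }

private:
  std::vector<KernelRecord> Records;
};

inline unsigned numArgsOf(const std::string &Name) {
  // auto is fine here: Registry::create() makes the type obvious.
  auto Reg = Registry::create();
  Reg->add({"copy", 2});

  // Explicit type here: find() alone does not tell the reader whether the
  // result is a pointer, an iterator, or an optional.
  const KernelRecord *Rec = Reg->find(Name);
  return Rec ? Rec->NumArgs : 0;
}
```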

Also, what are the plans for testing this? I realize much of this is just the skeleton with more to come, but is there any of this that might make sense to unit-test now or does it make more sense to wait?

@sommerlukas
Contributor Author

Thanks again for the feedback @steffenlarsen.

Also, what are the plans for testing this? I realize much of this is just the skeleton with more to come, but is there any of this that might make sense to unit-test now or does it make more sense to wait?

The sycl-fusion library will be tested in two ways:

  1. Through llvm-lit based unit tests for the different passes that are part of the library (but not yet part of this PR). These tests also test the metadata format for information about the fusion, identical parameters etc., and the whole SYCLKernelInfo/SYCLModuleInfo infrastructure in Kernel.h.
  2. Through SYCL application tests as part of llvm-test-suite, which test the whole fusion process as driven by the SYCL runtime. These tests are not yet part of this PR, as they require the logic connecting SYCL runtime and the sycl-fusion library to be upstreamed first (future PR).

Contributor

@steffenlarsen steffenlarsen left a comment

Thanks, @sommerlukas! LGTM!

@sommerlukas
Contributor Author

Thanks for the approvals!

@steffenlarsen @pvchupin, could either of you please merge this PR? I'm not authorized to merge it.

@pvchupin pvchupin merged commit 441bffe into intel:sycl Dec 5, 2022
againull pushed a commit that referenced this pull request Dec 16, 2022
This is the third patch in a series of patches to add an implementation
of the [kernel fusion
extension](#7098). We have split the
implementation into multiple patches to make them easier to review.
This patch integrates the kernel fusion extension into the SYCL runtime
scheduler.

Next to collecting the kernels submitted while in fusion mode in the
fusion list associated with the queue, the integration into the
scheduler is also responsible for detecting the synchronization
scenarios. Various scenarios, such as buffer destruction or event wait,
require fusion to be aborted early. The full list of scenarios is
available in the [extension
proposal](https://github.com/intel/llvm/blob/sycl/sycl/doc/extensions/experimental/sycl_ext_codeplay_kernel_fusion.asciidoc#synchronization-in-the-sycl-application).

A high-level description of the integration into the scheduler can be
found in the [design document](#7204).

This PR can be reviewed and merged independently of
#7465.

Signed-off-by: Lukas Sommer <[email protected]>

SPIRVBinary &emplaceSPIRVBinary(std::string Binary);

private:
// FIXME: Change this to std::shared_mutex after switching to C++17.
Contributor

We switched to C++17 as the minimum required version in mid-2022.
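
For reference, the change the FIXME asks for could look roughly like the sketch below. `BinaryStore` is a hypothetical stand-in for the class in the diff, and storing `std::string` instead of `SPIRVBinary` is a simplification. Only the use of `std::shared_mutex` follows from the FIXME itself.

```cpp
#include <cassert>
#include <cstddef>
#include <deque>
#include <mutex>
#include <shared_mutex>
#include <string>
#include <utility>

class BinaryStore {
public:
  // Writers take an exclusive lock while the container is modified.
  std::string &emplace(std::string Binary) {
    std::unique_lock<std::shared_mutex> Lock{Mutex};
    return Binaries.emplace_back(std::move(Binary));
  }

  // Readers take a shared lock, so several of them can run concurrently.
  std::size_t size() const {
    std::shared_lock<std::shared_mutex> Lock{Mutex};
    return Binaries.size();
  }

private:
  // mutable so that const readers can still lock the mutex.
  mutable std::shared_mutex Mutex;
  // std::deque keeps references to existing elements valid on emplace_back,
  // which matters because emplace() hands out a reference.
  std::deque<std::string> Binaries;
};
```

Whether a reader/writer lock pays off depends on how often the store is read compared to how often it is written, which this sketch does not try to measure.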

6 participants