[SYCL] Refactor SYCL kernel object handling in hierarchical parallelism #6212

bader · 2022-05-29T15:50:44Z

This patch refactors #1455 to avoid using the deprecated getPointerElementType function.
#1455 introduced code that uses pointer type information to create a shadow copy of the SYCL kernel object.

The same can be achieved by applying the work-group scope attribute to the SYCL kernel object. The compiler allocates such an object in the local address space, so the object is shared among all work-items in the work-group.

Author: againull <[email protected]>
Date: Fri Apr 3 00:59:46 2020 -0700

[SYCL] Share PFWG lambda object through shared memory (intel#1455)

In the current implementation private address of the PFWG lambda object is shared by leader work item through local memory to other work items. This is not correct. That is why perform the copy of the PFWG lambda object to shared memory and make work items work with address of the object in shared memory. I.e. this case should be handled in the similar way as for byval parameters.

Signed-off-by: Artur Gainullin <[email protected]>

bader · 2022-05-29T15:50:56Z

/summary:run

bader · 2022-05-30T16:15:10Z

@kbobrovs, @againull, I think I hit another bug/limitation of the pass. The pass doesn't look through function calls when it analyzes the execution scope, i.e. work-group vs. work-item.

void foo(sycl::group<1> group, ...) {
  group.parallel_for_work_item(range<1>(), [&](h_item<1> i) { ... });
}
...

  cgh.parallel_for_work_group<class kernel>(
    range<1>(...), range<1>(...), [=](group<1> g) {
      foo(g, ...);
    });

The pass emits the code to call foo once per work-group, but I can't find anything like this in the specification.
@intel/dpcpp-specification-reviewers, what is the expected behavior in this case?
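To make the reported behavior concrete, here is a toy model in plain C++ (no SYCL, not the actual LowerWGScope pass). It assumes that work-group scope statements are executed only by a leader work-item and that parallel_for_work_item bodies are executed by every work-item that reaches them. The names kWorkGroupSize, parallelForWorkItem, runDirect and runThroughHelper are hypothetical and exist only for this sketch.

```cpp
#include <cstddef>
#include <functional>

// Toy model only: one "work-group" is a loop over work-item ids, and
// work-item 0 plays the role of the leader. The size is an assumption.
constexpr std::size_t kWorkGroupSize = 4;

// Stand-in for group::parallel_for_work_item: every work-item that
// reaches this call executes the body once.
void parallelForWorkItem(const std::function<void()> &body) { body(); }

// Helper that hides the parallel_for_work_item call, like foo() above.
void foo(int &executed) {
  parallelForWorkItem([&] { ++executed; });
}

// The parallel_for_work_item call is written directly in the work-group
// lambda, so it is recognized as work-item scope: all work-items run it.
int runDirect() {
  int executed = 0;
  for (std::size_t id = 0; id < kWorkGroupSize; ++id)
    parallelForWorkItem([&] { ++executed; });
  return executed;
}

// The call to foo() is treated as an ordinary work-group scope statement,
// so in this model only the leader executes it.
int runThroughHelper() {
  int executed = 0;
  for (std::size_t id = 0; id < kWorkGroupSize; ++id) {
    const bool isLeader = (id == 0);
    if (isLeader)
      foo(executed);
  }
  return executed;
}
```

In this model, runDirect() counts one body execution per work-item, while runThroughHelper() counts a single execution for the whole group, which mirrors the "foo is called once per work-group" observation.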

kbobrovs · 2022-05-31T12:27:19Z

@bader, yes, this is a limitation of the pass. It should have been added as a TODO. As I recall, it was considered not practical to spend resources on adding support for such scenarios. A possible solution is known.

bader · 2022-05-31T12:59:28Z

> @bader, yes, this is a limitation of the pass. It should have been added as a TODO. As I recall, it was considered not practical to spend resources on adding support for such scenarios. A possible solution is known.

Thanks for the update. I tried to mark the generated kernel function with the work-group scope attribute, so that the LowerWGScope pass would put the parallel_for_work_group lambda object into local memory, but it also puts parallel_for_work_item one layer down in the call stack.

I think I'll try another idea. I'm going to change the pass to process kernel functions that call functions with the work-group scope attribute, in addition to functions marked with the attribute themselves. I'll move the code added by #1455 to the portion that processes kernel functions.
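The selection rule proposed here can be sketched over a toy call graph. This is only an illustration in plain C++, not LLVM code: ToyFunction, selectedBefore and selectedAfter are hypothetical names, and the sketch assumes that only direct callees are inspected.

```cpp
#include <algorithm>
#include <vector>

// Hypothetical stand-in for a function in the module; not an LLVM type.
struct ToyFunction {
  bool isKernel = false;        // assumed flag: function is a SYCL kernel
  bool hasWGScopeAttr = false;  // assumed flag: work-group scope attribute
  std::vector<const ToyFunction *> callees;  // direct callees only
};

// True if any direct callee carries the work-group scope attribute.
bool callsWGScopeFunction(const ToyFunction &f) {
  return std::any_of(f.callees.begin(), f.callees.end(),
                     [](const ToyFunction *c) { return c->hasWGScopeAttr; });
}

// Selection as described before the change: attribute-marked functions only.
bool selectedBefore(const ToyFunction &f) { return f.hasWGScopeAttr; }

// Selection as proposed: additionally pick kernels that call a marked function.
bool selectedAfter(const ToyFunction &f) {
  return f.hasWGScopeAttr || (f.isKernel && callsWGScopeFunction(f));
}
```

With a kernel whose only callee is a marked parallel_for_work_group lambda, selectedBefore rejects the kernel and selectedAfter accepts it, so the kernel-level handling of the kernel object would have a place to run.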

kbobrovs · 2022-06-09T09:52:50Z

llvm/lib/SYCLLowerIR/LowerWGScope.cpp

-    LLVMContext &Ctx = At.getContext();
-    IRBuilder<> Builder(Ctx);
-    Builder.SetInsertPoint(&LeaderBB->front());
+    if (!Arg.hasByValAttr())


Nit: we skip "this" because it is allocated in the proper AS by the FE, correct? A comment would be helpful for the reader.

bader · 2022-06-09

Right. I just reverted the changes from #1455 and tried to re-implement them by fixing the address space in clang instead.
Do you want me to add a comment saying that "this" points to the object in the local address space, so we don't need a shadow copy for that argument?

kbobrovs

LowerWGScope.cpp LGTM

elizabethandrews

Thanks for the detailed review @erichkeane and for the change @bader. FE changes LGTM

bader added 4 commits May 29, 2022 07:56

Update pass tests.

6c60344

Improve debug information for LowerWGScope pass.

3530e1d

Mark auto-generated kernel with work-group metadata.

0e13ee9

Fix formatting.

e1511a7

bader added 4 commits May 31, 2022 06:07

Add TODO comment for a future improvement.

eb6d2d7

Revert "Mark auto-generated kernel with work-group metadata."

969b468

This reverts commit 0e13ee9.

Add work-group scope attribute to the SYCL kernel

ca26f57

Kernel objects passed to parallel_for_work_group function must be shared among all work-items within a work-group.

clang-format

1a61c46

bader marked this pull request as ready for review June 7, 2022 19:03

bader requested review from a team as code owners June 7, 2022 19:03

bader commented Jun 7, 2022

clang/lib/Sema/SemaSYCL.cpp (outdated, resolved)

bader changed the title ~~[SYCL] Refactor lower work-group scope pass~~ [SYCL] Refactor SYCK kernel object handling in hierarchical parallelism Jun 7, 2022

bader changed the title ~~[SYCL] Refactor SYCK kernel object handling in hierarchical parallelism~~ [SYCL] Refactor SYCL kernel object handling in hierarchical parallelism Jun 8, 2022

bader requested review from againull and kbobrovs June 8, 2022 15:21

kbobrovs reviewed Jun 9, 2022

llvm/lib/SYCLLowerIR/LowerWGScope.cpp (resolved)

kbobrovs reviewed Jun 9, 2022

kbobrovs previously approved these changes Jun 9, 2022

Set anonymous struct flag for lambda types only. Small refactoring.

b970a77

bader dismissed kbobrovs’s stale review via b970a77 June 12, 2022 14:35

bader requested review from erichkeane, premanandrao and elizabethandrews June 12, 2022 14:37

Remove unused function parameter.

3886d8d

erichkeane reviewed Jun 13, 2022

clang/lib/Sema/SemaSYCL.cpp (outdated, resolved)

Apply code review suggestion.

959e1aa

erichkeane reviewed Jun 13, 2022

clang/lib/Sema/SemaSYCL.cpp (outdated, resolved)

Give a name to SYCL kernel object to make mangle-able static version of it.

2d987ac

bader requested a review from erichkeane June 14, 2022 17:35

erichkeane reviewed Jun 14, 2022

clang/lib/Sema/SemaSYCL.cpp (outdated, resolved)

Apply code review comments.

fdeaab6

bader requested a review from erichkeane June 14, 2022 18:00

erichkeane reviewed Jun 14, 2022

clang/lib/Sema/SemaSYCL.cpp (outdated, resolved)

Change getOwn to get.

0a5834b

elizabethandrews approved these changes Jun 14, 2022

smanna12 approved these changes Jun 14, 2022

bader merged commit 0c7a1e1 into intel:sycl Jun 15, 2022

bader deleted the lower-wg-scope-refactor branch June 15, 2022 08:41
