[SYCL] Share PFWG lambda object through shared memory #1455
LGTM, a few minor notes.
@@ -679,10 +688,19 @@ static void fixupPrivateMemoryPFWILambdaCaptures(CallInst *PFWICall) {
// Go through "byval" parameters which are passed as AS(0) pointers
// and: (1) create local shadows for them (2) and initialize them from the
// leader's copy and (3) replace usages with pointer to the shadow
//
// Do the same with PFWG lamda object parameter ('this' pointer) which is also
// passed as AS(0) pointer.
Please add a comment stating that this relies on the assumption that '*this' coming from PFWG is not modified down the call graph. Otherwise a copy-back would be needed.
Thx, added a comment explaining why copy-back is not needed.
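To illustrate the assumption discussed above, here is a minimal sketch in plain C++ (not code from the patch). `KernelObject`, `readOnlyUse`, and `mutatingUse` are hypothetical names: the shadow copy stands in for the object placed in local memory, and the original stands in for the leader's private copy.

```cpp
// Hypothetical stand-in for the object reached through the 'this' pointer.
struct KernelObject {
  int counter;
};

// Copy-in only: sufficient when nothing downstream writes through the shadow.
int readOnlyUse(const KernelObject &original) {
  KernelObject shadow = original;      // shadow initialized from the original
  const KernelObject &view = shadow;   // downstream code only reads
  return view.counter + 1;
}

// If downstream code did write through the shadow, the original would go
// stale unless an explicit copy-back step ran afterwards.
void mutatingUse(KernelObject &original, bool copyBack) {
  KernelObject shadow = original;      // copy-in
  shadow.counter += 10;                // modification lands in the shadow only
  if (copyBack)
    original = shadow;                 // copy-back makes the change visible
}
```

With `copyBack` set to false the original keeps its old value, which is why skipping the copy-back is only safe under the read-only assumption.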
@@ -679,10 +688,19 @@ static void fixupPrivateMemoryPFWILambdaCaptures(CallInst *PFWICall) {
// Go through "byval" parameters which are passed as AS(0) pointers
// and: (1) create local shadows for them (2) and initialize them from the
// leader's copy and (3) replace usages with pointer to the shadow
//
// Do the same with PFWG lamda object parameter ('this' pointer) which is also
// passed as AS(0) pointer.
static void shareByValParams(Function &F, const Triple &TT) {
Since the function's purpose is extended, I suggest changing the name, e.g. `sharePFWGPrivateObjects` or something like this.
Fixed
RepVal = ConstantExpr::getPointerBitCastOrAddrSpaceCast(Shadow,
Arg.getType());
}
} else if (Arg.getArgNo() == 0) {
Please add a comment saying that this branch processes the 'this' pointer.
Fixed
; CHECK-NEXT: [[TMP0:%.*]] = load i64, i64 addrspace(1)* @__spirv_BuiltInLocalInvocationIndex
; CHECK-NEXT: [[CMPZ3:%.*]] = icmp eq i64 [[TMP0]], 0
; CHECK-NEXT: br i1 [[CMPZ3]], label [[LEADER:%.*]], label [[MERGE:%.*]]
; CHECK: leader:
; CHECK-NEXT: [[TMP1:%.*]] = bitcast %struct.zot* [[ARG1:%.*]] to i8*
; CHECK-NEXT: call void @llvm.memcpy.p3i8.p0i8.i64(i8 addrspace(3)* align 16 bitcast (%struct.zot addrspace(3)* @[[GROUP_SHADOW]] to i8 addrspace(3)*), i8* align 8 [[TMP1]], i64 96, i1 false)
; CHECK-NEXT: call void @llvm.memcpy.p3i8.p0i8.i64(i8 addrspace(3)* align 16 bitcast (%struct.zot addrspace(3)* @ArgShadow.2 to i8 addrspace(3)*), i8* align 8 [[TMP1]], i64 96, i1 false)
Please consider replacing `ArgShadow.2` with a match variable (added above at the declaration point).
Fixed
In the current implementation, the leader work item shares the private address of the PFWG lambda object with the other work items through local memory. This is not correct, because a private address is only meaningful in the work item that owns it. Instead, copy the PFWG lambda object to shared memory and make the work items use the address of the object in shared memory, i.e. handle this case the same way as byval parameters.
Signed-off-by: Artur Gainullin <[email protected]>
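The approach in the description above can be modelled in plain C++ as a rough sketch. This is a sequential simulation, not the LLVM pass itself; `PFWGLambda`, `LocalMemory`, `workItemBody`, and `runWorkGroup` are hypothetical names chosen for illustration, and the loop stands in for the work items of one work-group.

```cpp
#include <cassert>
#include <cstddef>
#include <vector>

// Hypothetical stand-in for the PFWG lambda object (its captured state).
struct PFWGLambda {
  int scale;
  int offset;
};

// Hypothetical model of one work-group's local (shared) memory.
struct LocalMemory {
  PFWGLambda shadow{};  // shadow copy visible to every work item
};

// Each work item reads the captures through the shared shadow only,
// never through the leader's private address.
int workItemBody(const LocalMemory &local, std::size_t localId) {
  return local.shadow.scale * static_cast<int>(localId) + local.shadow.offset;
}

// Simulates one work-group sequentially.
std::vector<int> runWorkGroup(const PFWGLambda &leaderPrivate,
                              std::size_t groupSize) {
  assert(groupSize > 0 && "a work-group has at least one work item");
  LocalMemory local;
  // Leader (local id 0) copies its private object into local memory.
  local.shadow = leaderPrivate;
  // On a real device a work-group barrier is needed at this point so that
  // no work item reads the shadow before the copy completes.
  std::vector<int> results;
  for (std::size_t id = 0; id < groupSize; ++id)
    results.push_back(workItemBody(local, id));
  return results;
}
```

For example, `runWorkGroup(PFWGLambda{2, 5}, 4)` makes every simulated work item compute from the same shadow copy rather than from a pointer into the leader's private memory.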
LGTM
…_private_api * origin/sycl: (614 commits) [SYCL][Doc] Update prerequisites in GetStartedGuide (intel#1466) [SYCL][USM] Remove vestigial dead code (intel#1474) [SYCL-PTX] Fix __spirv_GroupAsyncCopy stride computation (intel#1451) [Driver][SYCL] Emit an error if c compilation is forced (intel#1438) [SYCL] Fix sycl-post-link when no split and symbols are requested. (intel#1454) [SYCL] Change priority of devices in default_selector (intel#1264) [CI] Update CODEOWNERS matching rules order (intel#1468) [SYCL] Share PFWG lambda object through shared memory (intel#1455) [CI] Fix CODEOWNERS file syntax (intel#1464) [SYCL][CUDA] Fix active context when creating base event (intel#1447) [SYCL] Diagnose implicit declaration of kernel function type (intel#1450) [BuildBot] Modify configure script (intel#1421) [SYCL] Resolve min/max conflict (intel#1339) [CI][BuildBot] Fix configure parameter to turn on/off assertions (intel#1449) [SYCL] XFAIL LIT test due to duplicate diagnostic [SYCL] Remove explicit sycl_device attribute requirement Apply more suggestions Apply suggestions Translate new set of Intel FPGA Loop Controls Translate Intel FPGA force_pow2_depth memory attribute ...
…c_abi_checks * origin/sycl: (625 commits) [SYCL][Test] Disable spec_const_redefine.cpp on all devices but HOST (intel#1488) [SYCL] Only export public API (intel#1456) [SYCL][CUDA] Fix selected_binary argument in piextDeviceSelectBinary (intel#1475) [SYCL] Enable LIT testing with CUDA BE (intel#1458) [SYCL] Fix float to half-type conversion (intel#1395) [NFC] Cleanup unneded macro from builtins implementation (intel#1445) Enable cfg-printer LLVM lit tests only if LLVM linked statically (intel#1479) [SYCL][NFC] Reflect the "allowlist" renaming in the code (intel#1480) [SYCL][Doc] Update prerequisites in GetStartedGuide (intel#1466) [SYCL][USM] Remove vestigial dead code (intel#1474) [SYCL-PTX] Fix __spirv_GroupAsyncCopy stride computation (intel#1451) [Driver][SYCL] Emit an error if c compilation is forced (intel#1438) [SYCL] Fix sycl-post-link when no split and symbols are requested. (intel#1454) [SYCL] Change priority of devices in default_selector (intel#1264) [CI] Update CODEOWNERS matching rules order (intel#1468) [SYCL] Share PFWG lambda object through shared memory (intel#1455) [CI] Fix CODEOWNERS file syntax (intel#1464) [SYCL][CUDA] Fix active context when creating base event (intel#1447) [SYCL] Diagnose implicit declaration of kernel function type (intel#1450) [BuildBot] Modify configure script (intel#1421) ...
…sm (#6212) This patch refactors #1455 to avoid uses of the deprecated `getPointerElementType` function. #1455 introduced code that uses pointer type information to create a shadow copy of the SYCL kernel object. The same can be achieved by applying the `work-group` scope attribute to the SYCL kernel object. The compiler allocates such an object in the local address space, so the object is shared among all work-items in the work-group.