-
Notifications
You must be signed in to change notification settings - Fork 768
[SYCL] Do not emit unneeded static initializations in sycl device code #1774
New issue
Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.
By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.
Already on GitHub? Sign in to your account
Conversation
4eeb540
to
ce3a81f
Compare
clang/lib/CodeGen/CodeGenModule.cpp
Outdated
@@ -4687,6 +4687,10 @@ void CodeGenModule::HandleCXXStaticMemberVarInstantiation(VarDecl *VD) { | |||
if (DK == VarDecl::Definition && VD->hasAttr<DLLImportAttr>()) | |||
return; | |||
|
|||
// Emit only those declarations that must be, when on SYCL device. | |||
if (LangOpts.SYCLIsDevice && !MustBeEmitted(VD)) |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
This doesn't seem to be the right place for this. This is the initialization, not when the variable itself is being emitted.
Can you backtrace a little to see what is causing the static member to be emitted, and do the test there?
Also, do we have the same problem with globals (non-members)? How about magic-statics?
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
What's happening with this is that the static variable emission (as well as globals and function-local statics) is being throttled using the usual mechanism in CodeGen and AST routines. However, the initialization is being generated via a call from Sema:HandleCXXStaticMemberVarInstantiation() and the initialization later forces the variable to be emitted as well (since it gets used).
I have moved the check earlier in Sema.
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Can you point out the place where "initialization later forces the variable to be emitted as well"?
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
In CodeGenModule.cpp:
StringRef MangledName = getMangledName(GD);
if (GetGlobalValue(MangledName) != nullptr) {
// The value has already been used and should therefore be emitted.
addDeferredDeclToEmit(GD);
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
And could you please point how exactly initialization makes variable used?
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
That is called from Sema: SemaTemplateInstantiateDecl.cpp via PassToConsumeRAII but there isn't enough information there to suppress it. I tried, but incorrectly; I couldn't figure out what property of Var I could use there.
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Perhaps we SHOULD be marking these vardecls as 'used-by-sycl' in some manner. If they aren't referenced in a kernel, we probably shouldn't be instantiating them, right?
That way we could just suppress the instantiate-variable-definition in PerformPendingInstantiations.
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Although, actually... If these are static-data-members, don't we KNOW that they can't be emitted? We have an error that prevents them from being used, right? So 'Is static member && SYCL' should be enough. Either at the beginning of InstantiateVariableDefinition or in performpendinginstantiations, though I'm not sure where would be better. See 6096 for how to check static-data-member.
That said, do Variable Templates cause us problems here too?
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
We diagnose non-const static data members. I will check out the other places you mentioned.
I haven't specifically tested variable templates, will do so.
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Ah, right. Does this patch make sure that this gets emitted anyway if it is used then? I'd hate for us to suppress this here, but have it be used later/before.
ce3a81f
to
9bb619d
Compare
if (getLangOpts().SYCLIsDevice && !Var->hasAttr<SYCLKernelAttr>() && | ||
!Var->hasAttr<SYCLDeviceAttr>()) | ||
return; |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
This stuff doesn't seem right. That is probably why tests are failing.
sycl_kernel
attribute can be applied only to a function template.
Only SYCL_EXTERNAL
functions can have sycl_device
attribute.
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Yes, I realize that now. Can you suggest a better approach? I could go back to my original change in CodeGenModule.cpp
9bb619d
to
beb531e
Compare
beb531e
to
ca426ec
Compare
ca426ec
to
5a89f11
Compare
Looks like this patch breaks pipes. @MrSidims , could you please take a look at the test logs? Have you seen such errors before? |
I'll take a look, we might need to have to do some staggered changes. |
According to log the failure came from FPGA emulator backend's pass, that recognizes SYCL pipes. I'm currently building the debug version of the emulator locally to reproduce, but... |
Thank you @MrSidims , appreciate your help. |
404c964
to
0a562f4
Compare
Signed-off-by: Premanand M Rao <[email protected]>
0a562f4
to
96ef141
Compare
@Fznamznon and @erichkeane, if either of you could review my latest set of changes, I'd appreciate it. |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Please stop force-pushing, it makes reviews really difficult. Second, @MrSidims , please take a look and make sure you're happy.
Yeah, I happy. LGTM |
Sorry. I will talk offline with you and figure out the BKM when regular push fails. |
clang/lib/CodeGen/CodeGenModule.cpp
Outdated
if (I.AssociatedData->getType()->getPointerAddressSpace() != | ||
VoidPtrTy->getAddressSpace()) | ||
ctor.add( | ||
llvm::ConstantExpr::getAddrSpaceCast(I.AssociatedData, VoidPtrTy)); |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
You are adding a new address space cast, but I don't see any address space casts in the test.
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Thanks, I will modify the test to show this.
clang/lib/CodeGen/CodeGenModule.cpp
Outdated
if (I.AssociatedData->getType()->getPointerAddressSpace() != | ||
VoidPtrTy->getAddressSpace()) |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Should we check that it is correct to do address space cast here?
Like this check
llvm/clang/lib/CodeGen/CGExprScalar.cpp
Line 2996 in ccefc93
static void ensureSameAddrSpace(Value *&RHS, Value *&LHS, |
@asavonic ?
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Yes, we have to ensure that an addrspace cast is valid.
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Okay, will add this check.
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
In the cases above, it looks like they values are both cast to the generic address space, to ensure sameness. This does not seem to be needed here, as the cast is to address space 0. However, I have changed the call to use an existing routine (which does the check as well) which is called in few other places, instead of checking it directly how I had it before.
Signed-off-by: Premanand M Rao <[email protected]>
Signed-off-by: Premanand M Rao <[email protected]>
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
I'm OK with this, but @Fznamznon had a pending question? Was that answer what you were looking for?
…alized Const static data members need to be either zero-initialized or constant-initialized. We were emitting all const static data members before, which needed address space casts from constant-space to private-space. This change also reverts the address-space cast introduced in intel#1774 Signed-off-by: Premanand M Rao <[email protected]>
Const static data members need to be either zero-initialized or constant-initialized. We were emitting initializers for all const static data members before, which needed invalid address space casts from constant-space to private-space. This change also reverts the address-space cast introduced in intel#1774 Signed-off-by: Premanand M Rao <[email protected]>
Const static variables need to be either zero-initialized or constant-initialized. We were emitting initializers for all const static variables before, which needed invalid address space casts from constant-space to private-space. We now diagnose when they are neither zero- nor constant-initialized and used in device code. This change also reverts the address-space cast introduced in intel#1774 Signed-off-by: Premanand M Rao <[email protected]>
This PR adds the new execution mode RegisterMapInterfaceINTEL, see the Khronos SPIRV spec here: KhronosGroup/SPIRV-Registry#176 This execution mode allows specifying a 'register' based interface for FPGA kernels. The RegisterMapInterfaceINTEL execution mode is added with a 0/1 literal based on the kernel metadata. When the metadata is: !ip_interface !N !N = !{!"csr"} The translator emits RegisterMapInterfaceINTEL 0, and when the metadata is: !ip_interface !N !N = !{!"csr", !"accept_downstream_stall"} The translator emits RegisterMapInterfaceINTEL 1 NOTE The new mode is under capability FPGAKernelAttributesv2INTEL which implicitly defines the capability FPGAKernelAttributesv2INTEL. This PR is very similar to Implement StreamingInterfaceINTEL execution mode intel#1218 Original commit: KhronosGroup/SPIRV-LLVM-Translator@a9f4f25
Signed-off-by: Premanand M Rao [email protected]