Skip to content

[SYCL] Do not emit unneeded static initializations in sycl device code #1774

New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

Merged
merged 3 commits into from
Jun 8, 2020

Conversation

premanandrao
Copy link
Contributor

Signed-off-by: Premanand M Rao [email protected]

@@ -4687,6 +4687,10 @@ void CodeGenModule::HandleCXXStaticMemberVarInstantiation(VarDecl *VD) {
if (DK == VarDecl::Definition && VD->hasAttr<DLLImportAttr>())
return;

// Emit only those declarations that must be, when on SYCL device.
if (LangOpts.SYCLIsDevice && !MustBeEmitted(VD))
Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

This doesn't seem to be the right place for this. This is the initialization, not when the variable itself is being emitted.

Can you backtrace a little to see what is causing the static member to be emitted, and do the test there?

Also, do we have the same problem with globals (non-members)? How about magic-statics?

Copy link
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

What's happening with this is that the static variable emission (as well as globals and function-local statics) is being throttled using the usual mechanism in CodeGen and AST routines. However, the initialization is being generated via a call from Sema:HandleCXXStaticMemberVarInstantiation() and the initialization later forces the variable to be emitted as well (since it gets used).

I have moved the check earlier in Sema.

Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Can you point out the place where "initialization later forces the variable to be emitted as well"?

Copy link
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

In CodeGenModule.cpp:

  StringRef MangledName = getMangledName(GD);
  if (GetGlobalValue(MangledName) != nullptr) {
    // The value has already been used and should therefore be emitted.
    addDeferredDeclToEmit(GD);

Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

And could you please point how exactly initialization makes variable used?

Copy link
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

That is called from Sema: SemaTemplateInstantiateDecl.cpp via PassToConsumeRAII but there isn't enough information there to suppress it. I tried, but incorrectly; I couldn't figure out what property of Var I could use there.

Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Perhaps we SHOULD be marking these vardecls as 'used-by-sycl' in some manner. If they aren't referenced in a kernel, we probably shouldn't be instantiating them, right?

That way we could just suppress the instantiate-variable-definition in PerformPendingInstantiations.

Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Although, actually... If these are static-data-members, don't we KNOW that they can't be emitted? We have an error that prevents them from being used, right? So 'Is static member && SYCL' should be enough. Either at the beginning of InstantiateVariableDefinition or in performpendinginstantiations, though I'm not sure where would be better. See 6096 for how to check static-data-member.

That said, do Variable Templates cause us problems here too?

Copy link
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

We diagnose non-const static data members. I will check out the other places you mentioned.

I haven't specifically tested variable templates, will do so.

Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Ah, right. Does this patch make sure that this gets emitted anyway if it is used then? I'd hate for us to suppress this here, but have it be used later/before.

erichkeane
erichkeane previously approved these changes May 28, 2020
Comment on lines 5153 to 5155
if (getLangOpts().SYCLIsDevice && !Var->hasAttr<SYCLKernelAttr>() &&
!Var->hasAttr<SYCLDeviceAttr>())
return;
Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

This stuff doesn't seem right. That is probably why tests are failing.
sycl_kernel attribute can be applied only to a function template.
Only SYCL_EXTERNAL functions can have sycl_device attribute.

Copy link
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Yes, I realize that now. Can you suggest a better approach? I could go back to my original change in CodeGenModule.cpp

erichkeane
erichkeane previously approved these changes Jun 2, 2020
erichkeane
erichkeane previously approved these changes Jun 2, 2020
@premanandrao premanandrao self-assigned this Jun 2, 2020
@Fznamznon
Copy link
Contributor

Looks like this patch breaks pipes. @MrSidims , could you please take a look at the test logs? Have you seen such errors before?

@MrSidims
Copy link
Contributor

MrSidims commented Jun 3, 2020

Looks like this patch breaks pipes. @MrSidims , could you please take a look at the test logs? Have you seen such errors before?

I'll take a look, we might need to have to do some staggered changes.

@MrSidims
Copy link
Contributor

MrSidims commented Jun 3, 2020

Looks like this patch breaks pipes. @MrSidims , could you please take a look at the test logs? Have you seen such errors before?

I'll take a look, we might need to have to do some staggered changes.

According to log the failure came from FPGA emulator backend's pass, that recognizes SYCL pipes. I'm currently building the debug version of the emulator locally to reproduce, but...
...that looks weird, after applying the patch I see no changes but clang version metadata in LLVM IR module generated for fpga_pipes.cpp test, hence FPGA emulator backend flow for pipes shan't be affected. At the same time the same tests are passing on windows. Will dig more into this when build is complete.

@premanandrao
Copy link
Contributor Author

Thank you @MrSidims , appreciate your help.

@premanandrao premanandrao force-pushed the remote_staticinit branch 2 times, most recently from 404c964 to 0a562f4 Compare June 3, 2020 22:04
@premanandrao
Copy link
Contributor Author

@Fznamznon and @erichkeane, if either of you could review my latest set of changes, I'd appreciate it.

Copy link
Contributor

@erichkeane erichkeane left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Please stop force-pushing, it makes reviews really difficult. Second, @MrSidims , please take a look and make sure you're happy.

@MrSidims
Copy link
Contributor

MrSidims commented Jun 4, 2020

Second, @MrSidims , please take a look and make sure you're happy.

Yeah, I happy.

LGTM

@premanandrao
Copy link
Contributor Author

Please stop force-pushing, it makes reviews really difficult. Second, @MrSidims , please take a look and make sure you're happy.

Sorry. I will talk offline with you and figure out the BKM when regular push fails.

if (I.AssociatedData->getType()->getPointerAddressSpace() !=
VoidPtrTy->getAddressSpace())
ctor.add(
llvm::ConstantExpr::getAddrSpaceCast(I.AssociatedData, VoidPtrTy));
Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

You are adding a new address space cast, but I don't see any address space casts in the test.

Copy link
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Thanks, I will modify the test to show this.

Comment on lines 1279 to 1280
if (I.AssociatedData->getType()->getPointerAddressSpace() !=
VoidPtrTy->getAddressSpace())
Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Should we check that it is correct to do address space cast here?
Like this check

static void ensureSameAddrSpace(Value *&RHS, Value *&LHS,

@asavonic ?

Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Yes, we have to ensure that an addrspace cast is valid.

Copy link
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Okay, will add this check.

Copy link
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

In the cases above, it looks like they values are both cast to the generic address space, to ensure sameness. This does not seem to be needed here, as the cast is to address space 0. However, I have changed the call to use an existing routine (which does the check as well) which is called in few other places, instead of checking it directly how I had it before.

Copy link
Contributor

@erichkeane erichkeane left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

I'm OK with this, but @Fznamznon had a pending question? Was that answer what you were looking for?

@bader bader merged commit 88d2031 into intel:sycl Jun 8, 2020
premanandrao added a commit to premanandrao/llvm that referenced this pull request Jul 1, 2020
…alized

Const static data members need to be either zero-initialized or
constant-initialized.  We were emitting all const static data
members before, which needed address space casts from constant-space
to private-space.

This change also reverts the address-space cast introduced in
intel#1774

Signed-off-by: Premanand M Rao <[email protected]>
premanandrao added a commit to premanandrao/llvm that referenced this pull request Jul 3, 2020
Const static data members need to be either zero-initialized or
constant-initialized.  We were emitting initializers for all const
static data members before, which needed invalid address space casts
from constant-space to private-space.

This change also reverts the address-space cast introduced in
intel#1774

Signed-off-by: Premanand M Rao <[email protected]>
premanandrao added a commit to premanandrao/llvm that referenced this pull request Jul 4, 2020
Const static variables need to be either zero-initialized or
constant-initialized.  We were emitting initializers for all const
static variables before, which needed invalid address space casts
from constant-space to private-space.  We now diagnose when they
are neither zero- nor constant-initialized and used in device code.

This change also reverts the address-space cast introduced in
intel#1774

Signed-off-by: Premanand M Rao <[email protected]>
FreddyLeaf pushed a commit to FreddyLeaf/llvm that referenced this pull request Mar 22, 2023
This PR adds the new execution mode RegisterMapInterfaceINTEL, see the Khronos
SPIRV spec here: KhronosGroup/SPIRV-Registry#176
This execution mode allows specifying a 'register' based interface for FPGA kernels.

The RegisterMapInterfaceINTEL execution mode is added with a 0/1 literal based on
the kernel metadata. When the metadata is:

!ip_interface !N
!N = !{!"csr"}
The translator emits RegisterMapInterfaceINTEL 0, and when the metadata is:

!ip_interface !N
!N = !{!"csr", !"accept_downstream_stall"}
The translator emits RegisterMapInterfaceINTEL 1

NOTE
The new mode is under capability FPGAKernelAttributesv2INTEL which implicitly defines the capability FPGAKernelAttributesv2INTEL.
This PR is very similar to Implement StreamingInterfaceINTEL execution mode intel#1218

Original commit:
KhronosGroup/SPIRV-LLVM-Translator@a9f4f25
Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Labels
None yet
Projects
None yet
Development

Successfully merging this pull request may close these issues.

6 participants