Skip to content

[SYCL] LowerWGScope pass should not be skipped when -O0 is used #1155

New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

Merged
merged 1 commit into from
Feb 21, 2020
Merged
Show file tree
Hide file tree
Changes from all commits
Commits
File filter

Filter by extension

Filter by extension

Conversations
Failed to load comments.
Loading
Jump to
Jump to file
Failed to load files.
Loading
Diff view
Diff view
21 changes: 9 additions & 12 deletions clang/lib/CodeGen/SYCLLowerIR/LowerWGScope.cpp
Original file line number Diff line number Diff line change
Expand Up @@ -40,10 +40,10 @@
// local "shadow" variable. Before each PFWI invocation leader WI stores its
// private copy of the variable into the shadow (under "is leader" guard), then
// all WIs (ouside of "is leader" guard) load the shadow value into their
// private copies ("materialize" the private copy). This works becase these
// private copies ("materialize" the private copy). This works because these
// variables are uniform - i.e. have the same value in all WIs and are not
// changed within PFWI. The only exceptions are captures of private_memory
// isntances - see next.
// instances - see next.
// ** Kind 1:
// Even though WG-scope locals are supposed to be uniform, there is one
// exception - capture of local of kind 1. It is always captured by non-const
Expand All @@ -52,7 +52,7 @@
// of kind 1 variable's alloca is stored within the PFWI lambda.
// Materialization of the lambda object value writes result of alloca of the
// leader WI's private variable into the private copy of the lambda object,
// which is wrong. So for tese variables this pass adds a write of the private
// which is wrong. So for these variables this pass adds a write of the private
// variable's address into the private copy of the lambda object right after its
// materialization:
// if (is_leader())
Expand Down Expand Up @@ -120,9 +120,6 @@ class SYCLLowerWGScopeLegacyPass : public FunctionPass {

// run the LowerWGScope pass on the specified module
bool runOnFunction(Function &F) override {
if (skipFunction(F))
return false;

FunctionAnalysisManager FAM;
auto PA = Impl.run(F, FAM);
return !PA.areAllPreserved();
Expand Down Expand Up @@ -479,7 +476,7 @@ static void materializeLocalsInWIScopeBlocksImpl(
// Checks if there is a need to materialize value of given local in given work
// item-scope basic block.
static bool localMustBeMaterialized(const AllocaInst *L, const BasicBlock &BB) {
// TODO this is overly convervative - see speculations below.
// TODO this is overly conservative - see speculations below.
return true;
}

Expand Down Expand Up @@ -745,9 +742,9 @@ PreservedAnalyses SYCLLowerWGScopePass::run(Function &F,
for (; I->getOpcode() == Instruction::Alloca; I = I->getNextNode()) {
auto *AllocaI = dyn_cast<AllocaInst>(I);
// Allocas marked with "work_item_scope" are those originating from
// cl::sycl::private_memory<T> variables, which must in private. No
// shadows/materialization is needed for them because they can be updated
// only within PFWIs
// cl::sycl::private_memory<T> variables, which must be in private memory.
// No shadows/materialization is needed for them because they can be
// updated only within PFWIs
if (!AllocaI->getMetadata(WI_SCOPE_MD))
Allocas.insert(AllocaI);
}
Expand Down Expand Up @@ -801,7 +798,7 @@ PreservedAnalyses SYCLLowerWGScopePass::run(Function &F,
}
// There can be allocas not corresponding to any variable declared in user
// code but generated by the compiler - e.g. for non-trivially typed
// parameters passed by by value. There can be WG scope stores into such
// parameters passed by value. There can be WG scope stores into such
// allocas, which need to be made visible to all WIs. This is done via
// creating a "shadow" workgroup-shared variable and using it to propagate
// the value of the alloca'ed variable to worker WIs from the leader.
Expand All @@ -815,7 +812,7 @@ PreservedAnalyses SYCLLowerWGScopePass::run(Function &F,
// Now materialize the locals:
materializeLocalsInWIScopeBlocks(Allocas, WIScopeBBs);

// Fixup captured addresses of private_memory isntances in current WI
// Fixup captured addresses of private_memory instances in current WI
for (auto *PFWICall : PFWICalls)
fixupPrivateMemoryPFWILambdaCaptures(PFWICall);

Expand Down
6 changes: 6 additions & 0 deletions sycl/test/hier_par/hier_par_wgscope.cpp
Original file line number Diff line number Diff line change
Expand Up @@ -12,6 +12,12 @@
// RUN: %GPU_RUN_PLACEHOLDER %t.out
// RUN: %ACC_RUN_PLACEHOLDER %t.out

// RUN: %clangxx -O0 -fsycl %s -o %t.out
// RUN: env SYCL_DEVICE_TYPE=HOST %t.out
// RUN: %CPU_RUN_PLACEHOLDER %t.out
// RUN: %GPU_RUN_PLACEHOLDER %t.out
// RUN: %ACC_RUN_PLACEHOLDER %t.out

// This test checks correctness of hierarchical kernel execution when there is
// code and data in the work group scope.

Expand Down