Skip to content

Commit fd8ae8a

Browse files
committed
[SYCL] LowerWGScope pass should not be skipped when -O0 is used
LowerWGScope pass performs required transformations to enable hierarchical parallelism semantics. This pass should not be skipped even if optimizations are disabled. Also some typos in the comments are fixed. Signed-off-by: Artur Gainullin <[email protected]>
1 parent 0438422 commit fd8ae8a

File tree

2 files changed

+15
-12
lines changed

2 files changed

+15
-12
lines changed

clang/lib/CodeGen/SYCLLowerIR/LowerWGScope.cpp

Lines changed: 9 additions & 12 deletions
Original file line numberDiff line numberDiff line change
@@ -40,10 +40,10 @@
4040
// local "shadow" variable. Before each PFWI invocation leader WI stores its
4141
// private copy of the variable into the shadow (under "is leader" guard), then
4242
// all WIs (ouside of "is leader" guard) load the shadow value into their
43-
// private copies ("materialize" the private copy). This works becase these
43+
// private copies ("materialize" the private copy). This works because these
4444
// variables are uniform - i.e. have the same value in all WIs and are not
4545
// changed within PFWI. The only exceptions are captures of private_memory
46-
// isntances - see next.
46+
// instances - see next.
4747
// ** Kind 1:
4848
// Even though WG-scope locals are supposed to be uniform, there is one
4949
// exception - capture of local of kind 1. It is always captured by non-const
@@ -52,7 +52,7 @@
5252
// of kind 1 variable's alloca is stored within the PFWI lambda.
5353
// Materialization of the lambda object value writes result of alloca of the
5454
// leader WI's private variable into the private copy of the lambda object,
55-
// which is wrong. So for tese variables this pass adds a write of the private
55+
// which is wrong. So for these variables this pass adds a write of the private
5656
// variable's address into the private copy of the lambda object right after its
5757
// materialization:
5858
// if (is_leader())
@@ -120,9 +120,6 @@ class SYCLLowerWGScopeLegacyPass : public FunctionPass {
120120

121121
// run the LowerWGScope pass on the specified module
122122
bool runOnFunction(Function &F) override {
123-
if (skipFunction(F))
124-
return false;
125-
126123
FunctionAnalysisManager FAM;
127124
auto PA = Impl.run(F, FAM);
128125
return !PA.areAllPreserved();
@@ -479,7 +476,7 @@ static void materializeLocalsInWIScopeBlocksImpl(
479476
// Checks if there is a need to materialize value of given local in given work
480477
// item-scope basic block.
481478
static bool localMustBeMaterialized(const AllocaInst *L, const BasicBlock &BB) {
482-
// TODO this is overly convervative - see speculations below.
479+
// TODO this is overly conservative - see speculations below.
483480
return true;
484481
}
485482

@@ -745,9 +742,9 @@ PreservedAnalyses SYCLLowerWGScopePass::run(Function &F,
745742
for (; I->getOpcode() == Instruction::Alloca; I = I->getNextNode()) {
746743
auto *AllocaI = dyn_cast<AllocaInst>(I);
747744
// Allocas marked with "work_item_scope" are those originating from
748-
// cl::sycl::private_memory<T> variables, which must in private. No
749-
// shadows/materialization is needed for them because they can be updated
750-
// only within PFWIs
745+
// cl::sycl::private_memory<T> variables, which must be in private memory.
746+
// No shadows/materialization is needed for them because they can be
747+
// updated only within PFWIs
751748
if (!AllocaI->getMetadata(WI_SCOPE_MD))
752749
Allocas.insert(AllocaI);
753750
}
@@ -801,7 +798,7 @@ PreservedAnalyses SYCLLowerWGScopePass::run(Function &F,
801798
}
802799
// There can be allocas not corresponding to any variable declared in user
803800
// code but generated by the compiler - e.g. for non-trivially typed
804-
// parameters passed by by value. There can be WG scope stores into such
801+
// parameters passed by value. There can be WG scope stores into such
805802
// allocas, which need to be made visible to all WIs. This is done via
806803
// creating a "shadow" workgroup-shared variable and using it to propagate
807804
// the value of the alloca'ed variable to worker WIs from the leader.
@@ -815,7 +812,7 @@ PreservedAnalyses SYCLLowerWGScopePass::run(Function &F,
815812
// Now materialize the locals:
816813
materializeLocalsInWIScopeBlocks(Allocas, WIScopeBBs);
817814

818-
// Fixup captured addresses of private_memory isntances in current WI
815+
// Fixup captured addresses of private_memory instances in current WI
819816
for (auto *PFWICall : PFWICalls)
820817
fixupPrivateMemoryPFWILambdaCaptures(PFWICall);
821818

sycl/test/hier_par/hier_par_wgscope.cpp

Lines changed: 6 additions & 0 deletions
Original file line numberDiff line numberDiff line change
@@ -12,6 +12,12 @@
1212
// RUN: %GPU_RUN_PLACEHOLDER %t.out
1313
// RUN: %ACC_RUN_PLACEHOLDER %t.out
1414

15+
// RUN: %clangxx -O0 -fsycl %s -o %t.out
16+
// RUN: env SYCL_DEVICE_TYPE=HOST %t.out
17+
// RUN: %CPU_RUN_PLACEHOLDER %t.out
18+
// RUN: %GPU_RUN_PLACEHOLDER %t.out
19+
// RUN: %ACC_RUN_PLACEHOLDER %t.out
20+
1521
// This test checks correctness of hierarchical kernel execution when there is
1622
// code and data in the work group scope.
1723

0 commit comments

Comments
 (0)