Skip to content

[SYCL][ESIMD] Pass to replace simd* parameters with native llvm vectors. #2097

New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

Merged
merged 5 commits into from
Jul 28, 2020

Conversation

kbobrovs
Copy link
Contributor

This pass is needed for the ESIMD backend to generate correct code.

Author: Pratik Ashar [email protected]

Signed-off-by: Konstantin S Bobrovsky [email protected]

@pratikashar, please pick up the review

This pass is needed for the ESIMD backend to generate correct code.

Author: Pratik Ashar <[email protected]>

Signed-off-by: Konstantin S Bobrovsky <[email protected]>
@@ -9,6 +9,6 @@ __attribute__((opencl_private)) __attribute__((register_num(17))) int vc;

SYCL_EXTERNAL void init_vc(int x) {
vc = x;
// CHECK: store i32 %0, i32* @vc
// CHECK: store i32 %x, i32* @vc
Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

AFAIK, registers are not named if assertions are disabled.
I suggest using FileCheck pattern matching variables to make it work in both modes.

Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

will do.

Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

@kbobrovs - this is a test for Gang's global variable GRF binding. i dont think this test validates the pass under review.

Copy link
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Right, this is unrelated change - fixed.

@@ -0,0 +1,430 @@
//===-- ESIMDVecArgPass.cpp - lower Close To Metal (CM) constructs --------===//
Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Please, add a regression test for this pass.

Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

we've some tests that use this functionality. @kbobrovs - are we also planning on checking in esimd unit tests? we've subroutine.cpp that exercises this feature.

Copy link
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

@pratikashar, there should be a test similar to llvm/test/SYCLLowerIR/esimd_lower_intrins.ll which does not depend on anything else.

Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

@kbobrovs - added llvm/test/SYCLLowerIR/esimd_subroutine.ll

Copy link
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

@pratikashar, there are only few CHECKs, and I think the test could be made much simpler. E.g. _ZN4simdIiLi16EEC2ERS0_ and a call to it removed, _Z3fooi reduced. @bader - are you OK with the test?

// SPDX-License-Identifier: Apache-2.0 WITH LLVM-exception
//
//===----------------------------------------------------------------------===//
// Change in function parameter type from simd* to native llvm vector type for
Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Why this isn't done by clang's AST/CodeGen library?
Or even by using vector extension in the SYCL headers?

Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

is it legal for clang's CG to change type of parameter from simd* to say *?

Copy link
Contributor

@pratikashar pratikashar Jul 13, 2020

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

simd* to say *?

simd * to say < i16x32 >*?

Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

if you're referring to attribute((ext_vector_type(N)) then we do use this to declare member of class simd. the sole data member of class simd is of vector type with this attribute applied. simd class has several other methods implemented.

Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

is it legal for clang's CG to change type of parameter from simd* to say *?

If it's legal for LLVM pass it should be legal for any other compiler component. According to my understanding changing type at CodeGen should much easier to maintain than in LLVM pass. CodeGen lowers AST types into LLVM types, so it "changes" type already. According to my understanding changing LLVM type by pass is much more difficult to do.

if you're referring to attribute((ext_vector_type(N)) then we do use this to declare member of class simd. the sole data member of class simd is of vector type with this attribute applied. simd class has several other methods implemented.

Why do we expect llvm.genx.vload.v16i32.p4v16i32 intrinsic to accept simd class as a parameter instead of vector type data member? Could you point to the SYCL headers source code where this intrinsic is used, please? I'd like to check if it's possible to handle the problem in headers instead of LLVM pass.

Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

This lowering is required by the backend. More work done in offline compilation means less time spent in online compilation

Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

I don't think it requires any additional work impacting "compilation time".

Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

If we do not do this here, we have to do the same transformation in the backend, that is extra online compilation time.

Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

How do you decide which transformation should be done offline or online?

I still believe that doing this transformation 'here' is bad design.
Alternative option:

  • user change the source code to use vector type instead of simd to get better performance. (I think comments are not quite accurate and the code with simd is correct, but back-end is not able to optimize it)

Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Restricting user from using simd type in function declaration will make the ESIMD extension less usable. Hence not a good choice to us. The alternative of this pass is to change all SIMD type to vector type during clang CodeGen (somewhere in this long review, I see you indicated that is doable and better). You are the clang expert

@kbobrovs kbobrovs added the esimd Explicit SIMD feature label Jul 13, 2020
Signed-off-by: Konstantin S Bobrovsky <[email protected]>
@bader bader requested a review from andykaylor July 17, 2020 12:26
@bader
Copy link
Contributor

bader commented Jul 17, 2020

@andykaylor, I'm not sure if proposed change is aligned with the LLVM compiler architecture design. Could you clarify if it's okay from you POV, please?
Except that, I don't have other major concerns.

@kbobrovs
Copy link
Contributor Author

@andykaylor, I'm not sure if proposed change is aligned with the LLVM compiler architecture design. Could you clarify if it's okay from you POV, please?

@andykaylor, friendly ping

@andykaylor
Copy link
Contributor

@andykaylor, I'm not sure if proposed change is aligned with the LLVM compiler architecture design. Could you clarify if it's okay from you POV, please?

@andykaylor, friendly ping

I apologize for the delayed response. I hadn't seen the notification that my input was requested. I do have a lot to say about this, and I'm not sure it directly addresses Alexey's concerns, so please follow up as needed.

Let me start by saying that there is a long-standing plan for pointers to become typeless in LLVM IR. Much work has been done toward this, and though much more needs to be done that will eventually be the state of the LLVM IR language. More to the point, already in LLVM IR the type associated with pointers has no semantic meaning. The existence of functions like PointerType::getElementType() may lead you to believe that the type has semantic meaning, but it does not. If we have transformations that rely on knowing the types of pointer arguments, that is a problem. Any information that can't be deduced from the way in which memory is accessed should not be relied upon.

Having said that, the above statement is good news for the change under review here, though not necessarily the way it's currently implemented. As I understand it, the effect of this patch is to replace the type %class._ZTSN2cm3gen4simdIiLi16EEE.cm::gen::simd with <16 x i32> throughout the Module. The patch works downward from function arguments and global variables, but that's essentially the end result, right? Do we need to do a similar replacement with alloca values?

Any correspondance between C++ types and similar types in the LLVM IR type system is nothing more than an implementation detail. There is no requirement for these types to be kept connected. The clang front end creates LLVM IR types as needed to perform a literal translation of the source code, which will describe the program semantics. The back end is then free to transform the IR in any way that maintains the same semantics. The front end probably does need to keep the AST-level type information until it generates IR, so we probably don't want to change the type there. With regard to pointer values once we generate LLVM IR, the following types are all exactly equivalent:

%class._ZTSN2cm3gen4simdIiLi16EEE.cm::gen::simd*
<16 x i32>*
<8 x i64>*
<64 x i8>*
i8*

They are all pointers, and one of the fundamental guiding principles of LLVM IR is that memory does not have a type. So, what you're doing here is perfectly legal by LLVM IR rules.

As I said above, I do have concerns about the fact that we are relying on the types of pointers for certain transformations, but I'll put that aside for a moment and assume that perhaps the dependency isn't strictly as I've stated it and even if it is we can probably find another way to implement these transformations that removes the dependency.

The other thing I'd like to comment on, then, is the way this is implemented. You're starting from function arguments and global variables and then working down from there to update their uses. That will work fine for now, but when pointers become typeless it won't work. Global variables will, I think, continue to have a type associated with them in some way. Function arguments will not.

I think it might be better to structure the code now to examine the uses of arguments and global variables. Consider this function:

define void @f(%class._ZTS4simdIiLi16EE.simd* %x,
               %class._ZTS4simdIiLi16EE.simd* %y) {
entry:
  %x_data_ptr = getelementptr inbounds %class._ZTS4simdIiLi16EE.simd,
                                       %class._ZTS4simdIiLi16EE.simd* %x,
                                       i32 0, i32 0
  %x_data = load <16 x i32>, <16 x i32>* %x_data_ptr
  %y_data_ptr = getelementptr inbounds %class._ZTS4simdIiLi16EE.simd,
                                       %class._ZTS4simdIiLi16EE.simd* %y,
                                       i32 0, i32 0
  store <16 x i32> %x_data, <16 x i32>* %y_data_ptr
  ret void
}

With typeless pointers, that would look something like this:

define void @f(i8* %x,
               i8* %y) {
entry:
  %x_data_ptr = getelementptr inbounds %class._ZTS4simdIiLi16EE.simd,
                                       i8* %x,
                                       i32 0, i32 0
  %x_data = load <16 x i32>, i8* %x_data_ptr
  %y_data_ptr = getelementptr inbounds %class._ZTS4simdIiLi16EE.simd,
                                       i8* %y,
                                       i32 0, i32 0
  store <16 x i32> %x_data, i8* %y_data_ptr
  ret void
}

Currently the element type of the pointer argument matches the source element type operand of the GEP instruction, but the source element type operand only exists because that won't always be true. The same is true of the type operand of the load and store instructions.

In fact, there is nothing stopping someone from transforming the first IR into this today:

define void @f(i8* %x,
               i8* %y) {
entry:
  %x_data_ptr = bitcast i8* %x to <16 x i32>
  %x_data = load <16 x i32>, i8* %x_data_ptr
  %y_data_ptr = bitcast i8* %y to <16 x i32>
  store <16 x i32> %x_data, i8* %y_data_ptr
  ret void
}

or even this:

define void @f(i8* %xA,
               i8* %yA) {
entry:
  %x = bitcast i8* &xA to %class._ZTS4simdIiLi16EE.simd*
  %x_data_ptr = getelementptr inbounds %class._ZTS4simdIiLi16EE.simd,
                                       %class._ZTS4simdIiLi16EE.simd* %x,
                                       i32 0, i32 0
  %x_data = load <16 x i32>, <16 x i32>* %x_data_ptr
  %y = bitcast i8* &yA to %class._ZTS4simdIiLi16EE.simd*
  %y_data_ptr = getelementptr inbounds %class._ZTS4simdIiLi16EE.simd,
                                       %class._ZTS4simdIiLi16EE.simd* %y,
                                       i32 0, i32 0
  store <16 x i32> %x_data, <16 x i32>* %y_data_ptr
  ret void
}

My point is that the way the arguments are used gives you semantic information about what they point to. The pointer types do not.

; CHECK: define spir_func i32 {{.+}}bar{{.+}}(<16 x i32>* {{.+}}
define spir_func i32 @_Z3bar4simdIiLi16EE(%class._ZTS4simdIiLi16EE.simd* %v) #0 {
entry:
; CHECK: {{.+}} = bitcast <16 x i32>* {{.+}}
Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

I'd like to see a check of the updated function signature and a body of the function that accesses the arguments in some way. The uses of the updated arguments/globals are probably the most fragile part of this transformation.

Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

fixed. i've updated esimd-subroutine.ll with more CHECK lines. these check whether simd* was converted to <16 x i32>. it also verifies that bitcasts are inserted and they use the right function argument. another CHECK statement verifies that dst of bitcast is used in the right place.

Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

done

declare void @llvm.lifetime.end.p0i8(i64 immarg, i8* nocapture) #1

; Function Attrs: norecurse nounwind
define linkonce_odr spir_func void @_ZN4simdIiLi16EEC2ERS0_(%class._ZTS4simdIiLi16EE.simd addrspace(4)* %this, %class._ZTS4simdIiLi16EE.simd addrspace(4)* align 64 dereferenceable(64) %other) unnamed_addr #0 comdat align 2 {
Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Can you add some checks to verify that this function is updated correctly?

Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

done. please see above comment.


// Return ptr to first-class vector type if Value is a simd*, else return
// nullptr.
Type *ESIMDLowerVecArgPass::argIsSimdPtr(Value *arg) {
Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

NIT: The name of this function is inconsistent with what it does. The name implies a Boolean return value (and I see it is used that way in one place) but since it returns a type I would expect a name like getSimdArgPtrTyOrNull().

Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

fixed.

Type *ESIMDLowerVecArgPass::argIsSimdPtr(Value *arg) {
auto ArgType = arg->getType();
if (ArgType->isPointerTy()) {
auto containedType = ArgType->getPointerElementType();
Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

NIT: These three lines could be rewritten as:

if (auto *ArgType = dyn_cast<PointerType*>(arg->getType())) {
  auto *ContainedType = ArgType->getElementType()

Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

fixed.


if (isa<ConstantExpr>(V)) {
auto FirstOpnd = cast<ConstantExpr>(V)->getOperand(0);
return hasGlobalConstExpr(FirstOpnd, Global);
Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Would Value::stripPointerCasts() or one of its variants achieve the same effect without recursion?

Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

yes, i tried it and it worked. thanks for the tip. but eventually i did away with this function in latest patch.

// all such instances and replaces them with a new ConstantExpr
// consisting of new global vector* variable.
void ESIMDLowerVecArgPass::replaceConstExprWithGlobals(Module &M) {
for (auto &F : M) {
Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

This seems like a very slow way of finding global variables in const expressions. How about using for (auto &GV : M.globals()) and then iterate over uses of the global variables?

Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

fixed.

if (NewTy && !G.user_empty()) {
// Peel off ptr type that argIsSimdPtr applies
NewTy = NewTy->getPointerElementType();
auto ZeroInit = new APInt(32, 0);
Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

You're assuming that the element type is i32, I think. The rest of the code has a general approach to the contained vector type. I think ConstantAggregateZero::get(NewTy) is what you want here.

Shouldn't you also be checking to see what kind of initializer was used for the original variable?

Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

fixed.

i dont think static initialization is supported for on device code.

auto ZeroInit = new APInt(32, 0);
auto NewGlobalVar =
new GlobalVariable(NewTy, G.isConstant(), G.getLinkage(),
Constant::getIntegerValue(NewTy, *ZeroInit));
Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

I think you're missing thread local mode and address space from things to be copied here.

Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

fixed

SmallVector<DIGlobalVariableExpression *, 5> GVs;
G.getDebugInfo(GVs);
for (auto md : GVs) {
NewGlobalVar->addDebugInfo(md);
Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Are you sure there are no other types of metadata you can copy?

Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

fixed. now i invoke copyMetadata() function to copy all MD nodes.

@bader
Copy link
Contributor

bader commented Jul 23, 2020

Any correspondance between C++ types and similar types in the LLVM IR type system is nothing more than an implementation detail. There is no requirement for these types to be kept connected. The clang front end creates LLVM IR types as needed to perform a literal translation of the source code, which will describe the program semantics. The back end is then free to transform the IR in any way that maintains the same semantics. The front end probably does need to keep the AST-level type information until it generates IR, so we probably don't want to change the type there. With regard to pointer values once we generate LLVM IR, the following types are all exactly equivalent:

%class._ZTSN2cm3gen4simdIiLi16EEE.cm::gen::simd*
<16 x i32>*
<8 x i64>*
<64 x i8>*
i8*

They are all pointers, and one of the fundamental guiding principles of LLVM IR is that memory does not have a type.

If I understand it correctly GPU back-end will generate more efficient code if we replace %class._ZTSN2cm3gen4simdIiLi16EEE.cm::gen::simd* with <16 x i32>*. Is this aligned with the principles you mentioned?

As I said above, I do have concerns about the fact that we are relying on the types of pointers for certain transformations, but I'll put that aside for a moment and assume that perhaps the dependency isn't strictly as I've stated it and even if it is we can probably find another way to implement these transformations that removes the dependency.

Looking for C++ mangled type name %class._ZTSN2cm3gen4simdIiLi16EEE.cm::gen::simd* might lead to following issues:

  • C++ class name is defined not by the compiler, but in the SYCL library, which introduces hard dependency between the compiler and the library (i.e. we class name changes, compiler must be updated to use new name).
  • Clang's mangler scheme changes might break this transformation
  • This transformation might not apply to following SIMD class use cases:
    • User creates struct my_simd {simd s;}; and use my_simd.
    • User extends standard simd class functionality in derived class struct my_simd : simd {};
    • simd object captured into lambda object. Call operator will access it through this pointer.

There are a lot C++ semantics, which this pass might need to handle in order to enable efficient code generation.

@bader bader requested a review from andykaylor July 23, 2020 13:29
@kbobrovs
Copy link
Contributor Author

If I understand it correctly GPU back-end will generate more efficient code if we replace %class._ZTSN2cm3gen4simdIiLi16EEE.cm::gen::simd* with <16 x i32>*. Is this aligned with the principles you mentioned?

As I understand Andy, this equivalence exists only in some cases like formal parameter types. Instructions like GEP, alloca do need accurate type.
On the other hand, I think your proposal to make the type adjustment in the FE would be much easier to implement than start off the formal parameter usages (and handle pointer value data flow through possible casts, PHIs, etc)
The third option (maybe the easiest) would be to teach VC back-end to handle "wrapped" vectors, which *simd objects really are.
@pratikashar , @cmc-rep - what do you think?

Looking for C++ mangled type name %class._ZTSN2cm3gen4simdIiLi16EEE.cm::gen::simd*

This pass actually does not do that. It detects the simd object via its structural representation:

Type *ESIMDLowerVecArgPass::argIsSimdPtr(Value *arg) {
  auto ArgType = arg->getType();
  if (ArgType->isPointerTy()) {
    auto containedType = ArgType->getPointerElementType();
    if (containedType->isStructTy()) {
      if (containedType->getStructNumElements() == 1 &&
          containedType->getStructElementType(0)->isVectorTy()) {
        return PointerType::get(containedType->getStructElementType(0),
                                ArgType->getPointerAddressSpace());
      }
    }
  }
  return nullptr;
}

@pratikashar pratikashar force-pushed the esimd1-vecarg branch 2 times, most recently from 0aaf9be to f825c4d Compare July 23, 2020 16:22
@kbobrovs
Copy link
Contributor Author

@andykaylor, please take another look. I suggest that we commit this variant based on formal parameter type analysis, then file an issue recording your suggestion to re-implement this based on usage, plus listing couple other alternatives. Does this sound good?


$"_ZTSZZ4mainENK3$_0clERN2cl4sycl7handlerEE4Test" = comdat any

; CHECK: [[NEWGLOBAL:[@a-zA-Z0-9_]*]] = dso_local global {{.+}} align 64 #0
Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Why are you not checking the type and the initializer here?

Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

fixed

%cmp.i = icmp eq i64 %add.i.i.i.i.i, 0
%..i = select i1 %cmp.i, i64 %add.i5.i.i.i.i, i64 %add.i.i.i.i.i
%conv9.i = trunc i64 %..i to i32
; CHECK: store <16 x i32> {{.+}} bitcast (<16 x i32>* [[NEWGLOBAL]] to {{.+}}
Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

It seems like some interesting things are missing from the check here. For instance, the original store contains an addressspace cast. What happens to that?

Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

updated test to check for addrspacecasts

%1 = addrspacecast %"class._ZTSN2cl4sycl5intel3gpu4simdIiLi16EEE.cl::sycl::intel::gpu::simd"* %agg.tmp to %"class._ZTSN2cl4sycl5intel3gpu4simdIiLi16EEE.cl::sycl::intel::gpu::simd" addrspace(4)*
%M_data.i.i = getelementptr inbounds %"class._ZTSN2cl4sycl5intel3gpu4simdIiLi16EEE.cl::sycl::intel::gpu::simd", %"class._ZTSN2cl4sycl5intel3gpu4simdIiLi16EEE.cl::sycl::intel::gpu::simd" addrspace(4)* %0, i64 0, i32 0
%call.esimd.i.i = call <16 x i32> @llvm.genx.vload.v16i32.p4v16i32(<16 x i32> addrspace(4)* %M_data.i.i), !noalias !17
; CHECK: {{.+}} = call <16 x i32> {{.+}} [[NEWGLOBAL]] to {{.+}}
Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Again, I'd be interested in the details here.

// F may have multiple arguments of type simd*. This
// function updates all parameters along with call
// call sites of F.
Function *ESIMDLowerVecArgPass::rewriteFunc(Function &F) {
Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

There are a lot of details that need to be handled to do this correctly, so I think it would be more maintainable to use the existing clone functions.

// Use must be a call site
SmallVector<Value *, 10> Params;
auto User = use.getUser();
if (isa<CallInst>(User)) {
Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Yes, that's true.

auto User = use.getUser();
if (isa<CallInst>(User)) {
auto Call = cast<CallInst>(User);
for (unsigned int I = 0,
Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Perhaps you could assert here that the function is not variadic then?

@andykaylor
Copy link
Contributor

@andykaylor, please take another look. I suggest that we commit this variant based on formal parameter type analysis, then file an issue recording your suggestion to re-implement this based on usage, plus listing couple other alternatives. Does this sound good?

Yes, I think that's OK. It sounds like I need to speak with the IGC developers about dependence on pointer types. It should be fine for the near future as it is.

bader
bader previously approved these changes Jul 28, 2020
@bader
Copy link
Contributor

bader commented Jul 28, 2020

@andykaylor, could you approve the PR, if you are okay to merge the updated version of this PR, please?

Signed-off-by: Konstantin S Bobrovsky <[email protected]>
@kbobrovs
Copy link
Contributor Author

kbobrovs commented Jul 28, 2020

@pratikashar, to speedup review, I addressed comments from @AlexeySachkov (in #2134) - please check if you are OK with the changes.

andykaylor
andykaylor previously approved these changes Jul 28, 2020
bader
bader previously approved these changes Jul 28, 2020
@AlexeySachkov
Copy link
Contributor

My comments from #2134 weren't critical, so, I'm okay to merge this patch as-is. I will most likely have some other questions after more careful review, but we can address those later

Signed-off-by: Konstantin S Bobrovsky <[email protected]>
@kbobrovs kbobrovs dismissed stale reviews from bader and andykaylor via a2002b3 July 28, 2020 17:57
@pvchupin pvchupin merged commit cc4845a into intel:sycl Jul 28, 2020
alexbatashev pushed a commit to alexbatashev/llvm that referenced this pull request Jul 30, 2020
…rogram

* upstream/sycl: (609 commits)
  [SYCL] Fix fail in the post commit testing (intel#2210)
  [SYCL] Materialize shadow local variables for byval arguments before use (intel#2200)
  [SYCL] Support lambda functions passed to reduction (intel#2190)
  [SYCL][USM] Improve USM Allocator. (intel#2026)
  [SYCL] Disallow mutable lambdas (intel#1785)
  [SYCL][ESIMD] Setup compilation pipeline for ESIMD (intel#2134)
  [SYCL] Fix not found kernel due to empty kernel name when using set_arg(s) (intel#2181)
  [SYCL] Fixed check for set_arg (intel#2203)
  Refactor indirect access calls to minimize invocations. (intel#2185)
  [SYCL][NFC] Fix potential null-pointer access (intel#2197)
  [SYCL] Propagate attributes from transitive calls to kernel (intel#1878)
  [SYCL] Fix warnings from static analysis tool (intel#2193)
  [SYCL][NFC] Fix ac_float test for compilation with FE optimizations (intel#2184)
  [GitHub Actions] Uplift clang-format version to 10 (intel#2194)
  [SYCL][ESIMD] Pass to replace simd* parameters with native llvm vectors. (intel#2097)
  [SYCL][NFC] Fixed SYCL_PI_TRACE output while selecting a device. (intel#2192)
  [SYCL][FPGA] New spec for controlling load-store units in FPGAs (intel#2158)
  [SYCL][Doc] Clarify reqd_sub_group_size (intel#2103)
  [SYCL] Remove noreturn function attribute (intel#2165)
  [SYCL] Aligned set_arg behaviour with SYCL specification (intel#2159)
  ...
@kbobrovs kbobrovs deleted the esimd1-vecarg branch July 30, 2020 12:28
Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Labels
esimd Explicit SIMD feature
Projects
None yet
Development

Successfully merging this pull request may close these issues.

7 participants