Fix InitListExpr for OpenCL vectors #615

seven-mile · 2024-05-17T06:05:56Z

Current CIRGen may emit %vi4res = cir.vec.create(..., %vi2a, %vi2b) for the source OpenCL code vi4 vi4res = (vi4)(vi2a, vi2b), and end up with "inserting elements typed vi2 into a vector typed vi4 in LLVM IR".

The corresponding implementation from OG CodeGen is here. It uses shuffle operations to extend two vectors and merge the effective elements into the final result.

We can do this in CIRGen or in Lowering (keeping the cir.vec.create(%vi2a, %vi2b) in CIR rather than emitting shuffles immediately). I still prefer CIRGen.

Related to PR #613 . Suggested test case:

typedef int vi4 __attribute__((ext_vector_type(4)));
typedef int vi2 __attribute__((ext_vector_type(2)));

__kernel void func(void) {
  vi2 a = {1, 2}, b = {3, 4};
  
  vi4 res = (vi4)(a, b);

  a = (vi2){ res.xy };
}

bcardosolopes · 2024-05-20T23:23:41Z

Current CIRGen may emit %vi4res = cir.vec.create(..., %vi2a, %vi2b) for the source OpenCL code vi4 vi4res = (vi4)(vi2a, vi2b), and end up with "inserting elements typed vi2 into a vector typed vi4 in LLVM IR".

I don't remember offhand. Does this seem like something done by design (i.e. we already have test cases for this), or is it something we forgot to verify?

Looking at VecCreateOp::verify, my impression is that this isn't supported. Didn't you get verification errors?

The corresponding implementation from OG CodeGen is here. It uses shuffle operations to extend two vectors and merge the effective elements into the final result.

We can do this in CIRGen or in Lowering (keeping the cir.vec.create(%vi2a, %vi2b) in CIR rather than emitting shuffles immediately). I still prefer CIRGen.

Whatever we decide to do in CIRGen, we need to make sure that the corresponding LLVM lowering matches what OG CodeGen does (in this case, a series of shuffles). However, if CIRGen can map the semantics more clearly, we should do that: emitting shuffles in CIRGen potentially makes it harder to recover the original information, because we would need to look into the shuffle and recognize that it is just joining two smaller vectors.

I'd prefer to avoid shuffles this early, but if it's something we are already doing, then it wouldn't be inconsistent (and we can improve later by adding other ops). I'd also be fine with improving cir.vec.create to support the "building from smaller vectors" scenario. Another option would be to introduce operations that extend the number of lanes and use their results to build the vectors, but I'm not sure how well that feeds into cir.vec.create later.

@dkolsen-pgi, any suggestions on what might work better here?

dkolsen-pgi · 2024-05-20T23:46:16Z

GNU vectors do not support concatenating two vectors with the syntax:

vi4 res = (vi4)(a, b);

So I haven't implemented that in CIR.

I think this is best implemented with cir.vec.shuffle rather than cir.vec.create. Concatenating two vectors is one of the things that shufflevector is designed to do.

bcardosolopes · 2024-05-20T23:58:40Z

Works for me. A concat op would be cool too, but perhaps we could wait until we actually have a pass that would benefit from it, saving some compile time by not having to inspect the mask to reconstruct the concat.

Note you'd still need an operation to extend these vectors before passing them to a shuffle as input. We could probably use some form of cast for that.

dkolsen-pgi · 2024-05-21T00:11:27Z

Note you'd still need an operation to extend these vectors before passing them to a shuffle as input.

That's not necessary. The result vector can have a different size than the two input vectors.
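
To make the two options discussed in this thread concrete, here is a minimal scalar model in plain C. It is only a sketch: `shuffle2`, `UNDEF_LANE`, and the two `concat_*` helpers are hypothetical names made up for illustration, not CIR or LLVM APIs, and the masks in the widen-then-merge variant are an assumption about the general shape of that approach, not a copy of what OG CodeGen emits. Undefined lanes are modeled as 0, although a real shuffle may leave any value there.

```c
#include <assert.h>
#include <stddef.h>

/* Hypothetical marker for a mask entry whose result lane is undefined. */
#define UNDEF_LANE (-1)

/* Scalar model of a two-input shuffle. Lane i of `out` takes element
 * mask[i] of the concatenation (a, b). Both inputs have in_len lanes;
 * out_len may differ from in_len. */
static void shuffle2(const int *a, const int *b, size_t in_len,
                     const int *mask, size_t out_len, int *out) {
    for (size_t i = 0; i < out_len; ++i) {
        int m = mask[i];
        if (m == UNDEF_LANE) {
            out[i] = 0; /* any value would be acceptable; 0 is arbitrary */
            continue;
        }
        assert(m >= 0 && (size_t)m < 2 * in_len);
        out[i] = ((size_t)m < in_len) ? a[m] : b[(size_t)m - in_len];
    }
}

/* Option 1: widen each 2-lane input to 4 lanes, then merge the
 * meaningful lanes of the two widened vectors. */
static void concat_widen_then_merge(const int a[2], const int b[2],
                                    int out[4]) {
    static const int widen[4] = {0, 1, UNDEF_LANE, UNDEF_LANE};
    static const int merge[4] = {0, 1, 4, 5};
    int wa[4], wb[4];
    shuffle2(a, a, 2, widen, 4, wa);    /* wa = {a0, a1, undef, undef} */
    shuffle2(b, b, 2, widen, 4, wb);    /* wb = {b0, b1, undef, undef} */
    shuffle2(wa, wb, 4, merge, 4, out); /* out = {a0, a1, b0, b1} */
}

/* Option 2: a single shuffle whose result has more lanes than its inputs. */
static void concat_direct(const int a[2], const int b[2], int out[4]) {
    static const int mask[4] = {0, 1, 2, 3};
    shuffle2(a, b, 2, mask, 4, out);    /* out = {a0, a1, b0, b1} */
}
```

Both helpers produce the same four lanes for the test case in this issue (a = {1, 2}, b = {3, 4}). The direct form needs no intermediate widened values, and its mask is the identity over the concatenated inputs, which is easy for a later pass to recognize as a plain concat.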

seven-mile added the bug label Jun 7, 2024

seven-mile mentioned this issue Jun 15, 2024

[GSoC] Add OpenCL support to compile GPU kernels #689

Closed
