Fix InitListExpr for OpenCL vectors #615
Comments
I don't remember offhand. Does this seem like something done by design (i.e. we already have test cases for this), or is it something we forgot to verify? Looking at
Whatever we decide to do in CIRGen, we need to make sure that the corresponding LLVM lowering matches what OG codegen does (in this case, a series of shuffles). However, if we can do better in CIRGen and map the semantics more clearly, we should: if we emit shuffles in CIRGen, we make it potentially harder to recover the original information, because we need to look into the shuffle and recognize that it is just joining two smaller vectors. I'd prefer to avoid shuffles this early, but if it's something we are already doing, then it wouldn't be inconsistent (and we can improve later by adding other ops). I'd also be fine with improving @dkolsen-pgi, any suggestions on what you think might work better here?
GNU vectors do not support concatenating two vectors with the syntax:
So I haven't implemented that in CIR. I think this is best implemented with
Works for me. A concat op would be nice too, but perhaps we can wait until we actually have a pass that would benefit from not having to inspect the mask to reconstruct the concat. Note that you'd still need an operation to extend these vectors before passing them to a shuffle as input. We could probably use some form of cast for that.
That's not necessary. The result vector can have a different size than the two input vectors.
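To illustrate the point about result size, here is a minimal C++ model of a two-input shuffle whose result length is set by the mask rather than by the inputs. It is only a sketch of the semantics: `shuffleLanes` and `concatViaShuffle` are hypothetical names invented for this example, not CIR or LLVM APIs.

```cpp
#include <cassert>
#include <cstddef>
#include <numeric>
#include <vector>

// Hypothetical helper modeling a two-input shuffle.
// A mask index i < a.size() selects a[i]; otherwise it selects b[i - a.size()].
// The result has mask.size() lanes, independent of the input length.
std::vector<int> shuffleLanes(const std::vector<int>& a,
                              const std::vector<int>& b,
                              const std::vector<std::size_t>& mask) {
    assert(a.size() == b.size());  // both inputs share one vector type
    std::vector<int> out;
    out.reserve(mask.size());
    for (std::size_t idx : mask) {
        assert(idx < a.size() + b.size());
        out.push_back(idx < a.size() ? a[idx] : b[idx - a.size()]);
    }
    return out;
}

// Concatenation is the identity mask 0, 1, ..., 2n-1 over both inputs.
std::vector<int> concatViaShuffle(const std::vector<int>& a,
                                  const std::vector<int>& b) {
    std::vector<std::size_t> mask(a.size() + b.size());
    std::iota(mask.begin(), mask.end(), std::size_t{0});
    return shuffleLanes(a, b, mask);
}
```

Under this model, two 2-lane inputs and a 4-entry mask give a 4-lane result in a single step, so no separate widening is needed when both inputs have the same type.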
Current CIRGen may emit `%vi4res = cir.vec.create(..., %vi2a, %vi2b)` for the source OpenCL code `vi4 vi4res = (vi4)(vi2a, vi2b)`, and end up with "inserting elements typed `vi2` into a vector typed `vi4` in LLVM IR".
The corresponding implementation from OG CodeGen is here. It uses shuffle operations to extend two vectors and merge the effective elements into the final result.
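The extend-then-merge idea described above can be sketched as a small C++ model. This is an illustration under stated assumptions, not the OG CodeGen implementation: `shuffle`, `widen`, `mergeAt`, and `initFromParts` are hypothetical names, and undefined lanes are modeled as 0 even though real IR leaves them unspecified.

```cpp
#include <cassert>
#include <cstddef>
#include <vector>

using Vec = std::vector<int>;
constexpr int kUndef = -1;  // mask entry meaning "lane left undefined"

// Hypothetical model of a two-input shuffle; undefined lanes become 0 here.
Vec shuffle(const Vec& a, const Vec& b, const std::vector<int>& mask) {
    assert(a.size() == b.size());
    const int n = static_cast<int>(a.size());
    Vec out;
    out.reserve(mask.size());
    for (int idx : mask) {
        if (idx == kUndef) { out.push_back(0); continue; }
        assert(idx >= 0 && idx < 2 * n);
        out.push_back(idx < n ? a[idx] : b[idx - n]);
    }
    return out;
}

// Step 1: widen one part to the result length, padding with undefined lanes.
Vec widen(const Vec& part, std::size_t resultLen) {
    assert(part.size() <= resultLen);
    std::vector<int> mask(resultLen, kUndef);
    for (std::size_t i = 0; i < part.size(); ++i) mask[i] = static_cast<int>(i);
    return shuffle(part, part, mask);  // the second operand is never selected
}

// Step 2: keep the first `filled` lanes of acc, then take `count` lanes
// from the start of the widened part.
Vec mergeAt(const Vec& acc, const Vec& widened,
            std::size_t filled, std::size_t count) {
    const std::size_t n = acc.size();
    assert(widened.size() == n && filled + count <= n);
    std::vector<int> mask(n, kUndef);
    for (std::size_t i = 0; i < filled; ++i) mask[i] = static_cast<int>(i);
    for (std::size_t j = 0; j < count; ++j)
        mask[filled + j] = static_cast<int>(n + j);
    return shuffle(acc, widened, mask);
}

// Builds the result from vector parts, in the spirit of (vi4)(vi2a, vi2b).
Vec initFromParts(const std::vector<Vec>& parts) {
    std::size_t total = 0;
    for (const Vec& p : parts) total += p.size();
    Vec acc(total, 0);
    std::size_t filled = 0;
    for (const Vec& p : parts) {
        acc = mergeAt(acc, widen(p, total), filled, p.size());
        filled += p.size();
    }
    return acc;
}
```

Widening first means every merge shuffle sees two operands of the same type, which is what makes this scheme work for parts of different sizes, for example a 4-lane part followed by two 2-lane parts.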
We can do this either in CIRGen or in Lowering (that is, keep `cir.vec.create(%vi2a, %vi2b)` in CIR rather than emitting shuffles immediately). I still prefer CIRGen.
Related to PR #613. Suggested test case: