[ppc] Reference to a constant pool in simple functions shouldn't spill LR reg to the stack #959
I moved the contents of this PR to the PPC readme where it is more visible to those interested in PPC work.
keryell
pushed a commit
to keryell/llvm-project
that referenced
this issue
Oct 19, 2024
…nOp (llvm#959) They should use PoisonOp (which becomes PoisonValue in LLVM IR) as it is the OG's choice. Proof: We generate VecCreateOp [here](https://github.com/llvm/clangir/blob/2ca12fe5ec3a1e7279256f069010be2d68200585/clang/lib/CIR/CodeGen/CIRGenExprScalar.cpp#L1975), and its OG counterpart is [here](https://github.com/llvm/clangir/blob/2ca12fe5ec3a1e7279256f069010be2d68200585/clang/lib/CodeGen/CGExprScalar.cpp#L2096). OG uses PoisonValue. As for VecSplatOp, OG unconditionally [chooses PoisonValue](https://github.com/llvm/clangir/blob/2ca12fe5ec3a1e7279256f069010be2d68200585/llvm/lib/IR/IRBuilder.cpp#L1204). An even more solid proof is that when we use OG to generate code for the test cases I changed in this PR, it always uses poison instead of undef as far as VecSplat and VecCreate are concerned. See the [OG generated code for vectype-ext.cpp](https://godbolt.org/z/eqx1rns86) and the [OG generated code for vectype.cpp](https://godbolt.org/z/frMjbKGeT). For reference, the generated CIR for the test case vectype-ext.cpp is [here](https://godbolt.org/z/frMjbKGeT). This is to unblock llvm/clangir#936 and help set it on the right path. Note: There might be other CIR vec ops that need to choose Poison to be consistent with OG, but I'd limit the scope of this PR and wait to see whether issues pop up in the future.
xlauko
pushed a commit
to trailofbits/instafix-llvm
that referenced
this issue
Mar 28, 2025
This issue was closed.
Extended Description
Consider a function like this:
float foo(float X) { return X + 1234.4123f; }
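The FP constant ends up in the constant pool, so we need a PC-relative base to address it. Getting that base uses a bl/mflr pair, which clobbers the LR register, so LR has to be saved and restored.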
This ends up producing code like this:
_foo:
.LBB_foo_0: ; entry
mflr r11
*** stw r11, 8(r1)
bl "L00000$pb"
"L00000$pb":
mflr r2
addis r2, r2, ha16(.CPI_foo_0-"L00000$pb")
lfs f0, lo16(.CPI_foo_0-"L00000$pb")(r2)
fadds f1, f1, f0
*** lwz r11, 8(r1)
mtlr r11
blr
This is functional, but there is no reason to spill the LR register all the way
to the stack (the two instructions marked with ***): spilling it to a GPR is quite enough.
Implementing this will require some codegen improvements. Nate writes:
"So basically what we need to support the "no stack frame save and restore" is a
generalization of the LR optimization to "callee-save regs".
Currently, we have LR marked as a callee-save reg. The register allocator sees
that it's callee save, and spills it directly to the stack.
Ideally, something like this would happen:
LR would be in a separate register class from the GPRs. The class of LR would be
marked "unspillable". When the register allocator came across an unspillable
reg, it would ask "what is the best class to copy this into that I can spill?"
If it gets a class back, which it will in this case (the gprs), it grabs a free
register of that class. If it is then later necessary to spill that reg, so be it.
"
This makes an incredible amount of sense to me. :)
-Chris
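The policy Nate describes can be sketched as a small decision function. This is only a toy model under stated assumptions, not LLVM code: `RegClass`, `SaveLocation`, and `chooseSave` are hypothetical names invented for illustration, and the "no free register" case is simplified to an immediate stack slot rather than a copy that is spilled later.

```cpp
#include <optional>
#include <set>
#include <string>
#include <vector>

// Hypothetical toy model; none of these names are LLVM APIs.
struct RegClass {
    std::string name;
    bool spillable;                 // may registers of this class be stored to the stack directly?
    const RegClass* copyClass;      // best spillable class to copy into, or nullptr if none
    std::vector<std::string> regs;  // registers belonging to this class
};

struct SaveLocation {
    enum Kind { Register, StackSlot } kind;
    std::string reg;  // meaningful only when kind == Register
};

// Decide where to preserve a callee-saved register of class `rc`,
// given the set of registers that are already in use.
std::optional<SaveLocation> chooseSave(const RegClass& rc,
                                       const std::set<std::string>& inUse) {
    // Ordinary classes keep today's behaviour: spill to the stack.
    if (rc.spillable)
        return SaveLocation{SaveLocation::StackSlot, ""};

    // Unspillable class: ask for a class we can copy into.
    if (rc.copyClass == nullptr)
        return std::nullopt;  // nothing we can do in this model

    // Grab a free register of that class if one exists.
    for (const auto& r : rc.copyClass->regs)
        if (inUse.count(r) == 0)
            return SaveLocation{SaveLocation::Register, r};

    // Simplification: with no free register, the copy ends up on the stack.
    return SaveLocation{SaveLocation::StackSlot, ""};
}
```

In this model, with LR in its own unspillable class whose copy class is the GPRs, and with only r2 in use as in `foo` above, the function picks a free GPR and no stack traffic is needed. Only under register pressure does the save fall back to a stack slot.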