[InstCombine] Fold A pred C ? (A >> BW - 1) : 1 --> ZExt(A pred C ? A < 0 : 1) #69961
@llvm/pr-subscribers-llvm-transforms
Author: None (elhewaty)
Changes
Full diff: https://github.com/llvm/llvm-project/pull/69961.diff
2 Files Affected:
diff --git a/llvm/lib/Transforms/InstCombine/InstCombineSelect.cpp b/llvm/lib/Transforms/InstCombine/InstCombineSelect.cpp
index 7a15c0dee492b5a..4fd3f4f594d7a77 100644
--- a/llvm/lib/Transforms/InstCombine/InstCombineSelect.cpp
+++ b/llvm/lib/Transforms/InstCombine/InstCombineSelect.cpp
@@ -3415,6 +3415,20 @@ Instruction *InstCombinerImpl::visitSelectInst(SelectInst &SI) {
TrueVal);
}
+ // select (icmp eq a, 0), 1, (lshr a, 31) -> zext (icmp sle a, 0),
+ // which is then canonicalized to zext (icmp slt a, 1)
+ CmpInst::Predicate Pred;
+ Value *A;
+ const APInt *C;
+ if (match(CondVal, m_Cmp(Pred, m_Value(A), m_Zero())) &&
+ match(TrueVal, m_One()) &&
+ match(FalseVal, m_LShr(m_Specific(A), m_APInt(C))) &&
+ Pred == ICmpInst::ICMP_EQ && *C == 31) {
+ auto *Cond = Builder.CreateICmpSLE(A,
+ ConstantInt::getNullValue(A->getType()));
+ return new ZExtInst(Cond, A->getType());
+ }
+
if (Instruction *R = foldSelectOfBools(SI))
return R;
diff --git a/llvm/test/Transforms/InstCombine/icmp-select.ll b/llvm/test/Transforms/InstCombine/icmp-select.ll
index 0d723c9df32e2f4..7fe37200788dc5e 100644
--- a/llvm/test/Transforms/InstCombine/icmp-select.ll
+++ b/llvm/test/Transforms/InstCombine/icmp-select.ll
@@ -5,6 +5,18 @@ declare void @use(i8)
declare void @use.i1(i1)
declare i8 @llvm.umin.i8(i8, i8)
+define i32 @test_icmp_select_lte_0(i32 %0) {
+; CHECK-LABEL: @test_icmp_select_lte_0(
+; CHECK-NEXT: [[TMP2:%.*]] = icmp slt i32 [[TMP0:%.*]], 1
+; CHECK-NEXT: [[RE:%.*]] = zext i1 [[TMP2]] to i32
+; CHECK-NEXT: ret i32 [[RE]]
+;
+ %cml = icmp eq i32 %0, 0
+ %lshr = lshr i32 %0, 31
+ %re = select i1 %cml, i32 1, i32 %lshr
+ ret i32 %re
+}
+
define i1 @icmp_select_const(i8 %x, i8 %y) {
; CHECK-LABEL: @icmp_select_const(
; CHECK-NEXT: [[CMP1:%.*]] = icmp eq i8 [[X:%.*]], 0
✅ With the latest revision this PR passed the C/C++ code formatter.
if (match(CondVal, m_Cmp(Pred, m_Value(A), m_Zero())) &&
match(TrueVal, m_One()) &&
match(FalseVal, m_LShr(m_Specific(A), m_APInt(C))) &&
Pred == ICmpInst::ICMP_EQ && *C == 31) {
Hardcoded assumption of i32 instead of bit width - 1?
Needs tests with different sized integers and vectors
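The two review points above (use bit width - 1 rather than 31, and cover other integer widths) can be sanity-checked outside LLVM. What follows is only a minimal brute-force sketch in plain C++, not LLVM code: `selectForm` and `foldHolds` are hypothetical names, and the unsigned-to-signed cast assumes two's-complement wrapping (guaranteed from C++20, and what mainstream compilers do earlier).

```cpp
#include <cstdint>
#include <type_traits>

// Models: select (icmp eq a, 0), 1, (lshr a, BW - 1)
// where BW is the bit width of U rather than a hardcoded 32.
template <typename U>
unsigned selectForm(U a) {
  constexpr unsigned BW = sizeof(U) * 8;
  return a == 0 ? 1u : static_cast<unsigned>(a >> (BW - 1));
}

// Exhaustively compares the select form with zext (icmp sle a, 0)
// over every value of the 8- or 16-bit type U.
template <typename U>
bool foldHolds() {
  using S = std::make_signed_t<U>;
  constexpr unsigned BW = sizeof(U) * 8;
  for (std::uint32_t v = 0; v < (std::uint32_t{1} << BW); ++v) {
    U a = static_cast<U>(v);
    // Reinterpret the bits as signed (assumes two's-complement wrapping).
    unsigned folded = static_cast<S>(a) <= 0 ? 1u : 0u;
    if (selectForm(a) != folded)
      return false;
  }
  return true;
}
```

Calling `foldHolds<std::uint8_t>()` and `foldHolds<std::uint16_t>()` covers i8 and i16. Vector types are not modeled here and would still need dedicated IR tests.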
Similar patterns like
@XChy Can you explain further?
For example, However, I found that InstCombine has handled such pattern, just missing here for
Agree with @XChy on how to handle this. That will enable more combinations of predicates to fold.
This is something of a recurring pattern, see https://reviews.llvm.org/D154791 and #68244 for related patches.
@XChy Should first transform
Just
Then the optimization in this patch is handled somewhere?
Yes. What we need to do is just making
I mean, is there a patch already for this?
Things like Just fold
Yes, It makes, thanks. Will this
Yes, godbolt.
@XChy What is wrong with this approach?
m_Cmp -> m_ICmp. You may need to refer to https://reviews.llvm.org/D154791 and #68244.
And a ZExt instruction should be emitted here, surrounding the whole select.
@XChy I tried the following approach.
but I got this,
Don't reuse SI when replacing it with a new instruction; doing so makes the instruction reference itself.
What's wrong with this?
@elhewaty I think the type of your
e25e74f to 1ba6998
@dtcxzyw, Can you please review this, too?
Could you please rebase this patch first? I cannot apply it to my local repo:(
1ba6998 to 146d548
@dtcxzyw, done
Probably no, so I won't insist on this pattern if there is no particular benefit. I remember clang was sometimes suboptimal for expressions which contain By the way, shouldn't we at least change the title of this PR? Now we're dealing with a more general pattern.
@@ -3409,6 +3409,25 @@ Instruction *InstCombinerImpl::visitSelectInst(SelectInst &SI) {
return replaceOperand(SI, 2, S);
}

{
// A pred C ? (A >> BW - 1) : 1 --> ZExt(A pred C ? A < 0 : 1)
Should be generalized to Cond ? (A >> (BW - 1)) : 1 --> ZExt(Cond ? A < 0 : 1).
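The generalized rewrite does not depend on what the condition is, only on the fact that a logical shift right by BW - 1 yields the sign bit. A small brute-force sketch for 8-bit values, under the assumption that `Cond` is an arbitrary boolean; the function names are hypothetical and this is not the InstCombine implementation:

```cpp
#include <cstdint>
#include <initializer_list>

// Original form: Cond ? (A lshr (BW - 1)) : 1, with BW = 8.
inline std::uint8_t originalForm(bool cond, std::uint8_t a) {
  return cond ? static_cast<std::uint8_t>(a >> 7) : std::uint8_t{1};
}

// Rewritten form: zext (Cond ? (A <s 0) : true).
// The cast reinterprets the bits as signed (two's complement assumed).
inline std::uint8_t rewrittenForm(bool cond, std::uint8_t a) {
  bool bit = cond ? (static_cast<std::int8_t>(a) < 0) : true;
  return static_cast<std::uint8_t>(bit);
}

// Checks both forms against each other for every i8 value and both
// values of the condition.
inline bool generalizedFoldHolds() {
  for (int v = 0; v < 256; ++v)
    for (bool cond : {false, true})
      if (originalForm(cond, static_cast<std::uint8_t>(v)) !=
          rewrittenForm(cond, static_cast<std::uint8_t>(v)))
        return false;
  return true;
}
```

The equivalence alone says nothing about profitability; whether the select of i1 values folds further depends on what `Cond` implies about the sign of `A`.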
I am sorry I cannot give the approval for this patch if you cannot demonstrate that it benefits some real-world applications.
AFAICT the idea is that this is useful if A pred C implies something about A < 0, in which case it makes sense to fold A >> BW - 1 into A < 0, which is easier to reason about.
At least that's what the purpose looks like from the tests.
@goldsteinn, should I fold A < 0 instead of A Pred C, or keep the patch as a draft until we can demonstrate that it benefits some real-world applications?
No real comment about the real-world impact. I'm generally pretty in favor of "get anything we can in" but think that's a minority opinion.
I would refactor to make the goal explicit. Use makeExactICmpRegion on the A Pred C and just do the complete simplification if the range simplifies A < 0 to a constant.
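To illustrate the range-based suggestion: the question is whether every A satisfying A Pred C has the same sign. The sketch below answers that by brute force over i8 instead of using LLVM's ConstantRange, so it is only a stand-in for the idea; `Pred`, `holds`, and `signBitUnderPred` are hypothetical names introduced here.

```cpp
#include <cstdint>
#include <optional>

// A small subset of signed integer predicates, for illustration only.
enum class Pred { EQ, NE, SLT, SLE, SGT, SGE };

inline bool holds(Pred p, std::int8_t a, std::int8_t c) {
  switch (p) {
  case Pred::EQ:  return a == c;
  case Pred::NE:  return a != c;
  case Pred::SLT: return a < c;
  case Pred::SLE: return a <= c;
  case Pred::SGT: return a > c;
  case Pred::SGE: return a >= c;
  }
  return false;
}

// Returns true if every A with (A pred C) is negative, false if every
// such A is non-negative, and nullopt if the region mixes signs or is
// empty. A constant result means A < 0 simplifies under the condition.
inline std::optional<bool> signBitUnderPred(Pred p, std::int8_t c) {
  bool sawNeg = false, sawNonNeg = false;
  for (int v = -128; v <= 127; ++v) {
    if (!holds(p, static_cast<std::int8_t>(v), c))
      continue;
    (v < 0 ? sawNeg : sawNonNeg) = true;
  }
  if (sawNeg == sawNonNeg)
    return std::nullopt; // mixed signs, or no value satisfies the predicate
  return sawNeg;
}
```

For instance, `signBitUnderPred(Pred::SGT, -1)` reports that A < 0 is always false in that region, while `signBitUnderPred(Pred::SLE, 0)` gives no constant answer because the region contains both 0 and negative values.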
Alive2: https://alive2.llvm.org/ce/z/-T9Gwa
Fixes: [InstCombine] a ? (a < 0) : 1 --> (a <= 0) #68227