[SLP][TTI]Improve detection of the insert-subvector pattern for SLP. #74749
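@llvm/pr-subscribers-llvm-transforms
Author: Alexey Bataev (alexey-bataev)
Changes
The SLP vectorizer passes the type of the subvector and the mask, whose size determines the size of the resulting vector. TTI should support this pattern to improve cost estimation of the insert_subvector shuffle pattern.
Full diff: https://github.com/llvm/llvm-project/pull/74749.diff
2 Files Affected: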
diff --git a/llvm/include/llvm/CodeGen/BasicTTIImpl.h b/llvm/include/llvm/CodeGen/BasicTTIImpl.h
index e05ce2890a08c..94e35e1529397 100644
--- a/llvm/include/llvm/CodeGen/BasicTTIImpl.h
+++ b/llvm/include/llvm/CodeGen/BasicTTIImpl.h
@@ -942,7 +942,7 @@ class BasicTTIImplBase : public TargetTransformInfoImplCRTPBase<T> {
   TTI::ShuffleKind improveShuffleKindFromMask(TTI::ShuffleKind Kind,
                                               ArrayRef<int> Mask,
-                                              VectorType *Ty, int &Index,
+                                              VectorType *&Ty, int &Index,
                                               VectorType *&SubTy) const {
     if (Mask.empty())
       return Kind;
@@ -963,8 +963,13 @@ class BasicTTIImplBase : public TargetTransformInfoImplCRTPBase<T> {
     int NumSubElts;
     if (Mask.size() > 2 && ShuffleVectorInst::isInsertSubvectorMask(
                                Mask, NumSrcElts, NumSubElts, Index)) {
-      if (Index + NumSubElts > NumSrcElts)
-        return Kind;
+      if (Index + NumSubElts > NumSrcElts) {
+        if (Index + NumSrcElts > static_cast<int>(Mask.size()))
+          return Kind;
+        SubTy = FixedVectorType::get(Ty->getElementType(), Mask.size());
+        std::swap(Ty, SubTy);
+        return TTI::SK_InsertSubvector;
+      }
       SubTy = FixedVectorType::get(Ty->getElementType(), NumSubElts);
       return TTI::SK_InsertSubvector;
     }
diff --git a/llvm/test/Transforms/SLPVectorizer/RISCV/remarks-insert-into-small-vector.ll b/llvm/test/Transforms/SLPVectorizer/RISCV/remarks-insert-into-small-vector.ll
index 8e0f382222241..de1eecd98eeb3 100644
--- a/llvm/test/Transforms/SLPVectorizer/RISCV/remarks-insert-into-small-vector.ll
+++ b/llvm/test/Transforms/SLPVectorizer/RISCV/remarks-insert-into-small-vector.ll
@@ -8,7 +8,7 @@
; YAML-NEXT: Function: test
; YAML-NEXT: Args:
; YAML-NEXT: - String: 'Stores SLP vectorized with cost '
-; YAML-NEXT: - Cost: '9'
+; YAML-NEXT: - Cost: '3'
; YAML-NEXT: - String: ' and with tree size '
; YAML-NEXT: - TreeSize: '7'
@@ -19,20 +19,15 @@ define void @test() {
; CHECK-NEXT: [[TMP0:%.*]] = load float, ptr null, align 4
; CHECK-NEXT: [[TMP1:%.*]] = load float, ptr null, align 4
; CHECK-NEXT: [[TMP2:%.*]] = load float, ptr null, align 4
-; CHECK-NEXT: [[V9IDX:%.*]] = getelementptr i8, ptr null, i32 4
-; CHECK-NEXT: [[V14IDX:%.*]] = getelementptr i8, ptr null, i32 8
; CHECK-NEXT: [[TMP3:%.*]] = insertelement <2 x float> <float poison, float 0.000000e+00>, float [[TMP1]], i32 0
; CHECK-NEXT: [[TMP4:%.*]] = insertelement <2 x float> poison, float [[TMP0]], i32 0
; CHECK-NEXT: [[TMP5:%.*]] = insertelement <2 x float> [[TMP4]], float [[TMP2]], i32 1
; CHECK-NEXT: [[TMP6:%.*]] = fcmp ogt <2 x float> [[TMP3]], [[TMP5]]
-; CHECK-NEXT: [[TMP7:%.*]] = extractelement <2 x i1> [[TMP6]], i32 0
-; CHECK-NEXT: [[V0_0:%.*]] = select i1 [[TMP7]], float [[TMP0]], float 0.000000e+00
-; CHECK-NEXT: [[TMP8:%.*]] = select <2 x i1> [[TMP6]], <2 x float> [[TMP3]], <2 x float> zeroinitializer
-; CHECK-NEXT: [[TMP9:%.*]] = extractelement <2 x i1> [[TMP6]], i32 1
-; CHECK-NEXT: [[V9_0:%.*]] = select i1 [[TMP9]], float [[TMP2]], float 0.000000e+00
-; CHECK-NEXT: store float [[V0_0]], ptr null, align 4
-; CHECK-NEXT: store float [[V9_0]], ptr [[V9IDX]], align 4
-; CHECK-NEXT: store <2 x float> [[TMP8]], ptr [[V14IDX]], align 4
+; CHECK-NEXT: [[TMP7:%.*]] = shufflevector <2 x i1> [[TMP6]], <2 x i1> poison, <4 x i32> <i32 0, i32 1, i32 0, i32 1>
+; CHECK-NEXT: [[TMP8:%.*]] = shufflevector <2 x float> [[TMP5]], <2 x float> [[TMP3]], <4 x i32> <i32 0, i32 1, i32 2, i32 poison>
+; CHECK-NEXT: [[TMP9:%.*]] = shufflevector <4 x float> [[TMP8]], <4 x float> <float poison, float poison, float poison, float 0.000000e+00>, <4 x i32> <i32 0, i32 1, i32 2, i32 7>
+; CHECK-NEXT: [[TMP10:%.*]] = select <4 x i1> [[TMP7]], <4 x float> [[TMP9]], <4 x float> zeroinitializer
+; CHECK-NEXT: store <4 x float> [[TMP10]], ptr null, align 4
; CHECK-NEXT: ret void
;
entry:
Could we not improve ShuffleVectorInst::isInsertSubvectorMask directly instead?
Sorry, did not quite understand. You mean do the changes in ShuffleVectorInst::isInsertSubvectorMask? Unfortunately, no. Currently improveShuffleKindFromMask just does not provide correct type/subtype at all, so even changing ShuffleVectorInst::isInsertSubvectorMask won't help.
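For readers following the type/subtype point, here is a minimal standalone sketch of the branch structure in the patched `improveShuffleKindFromMask`, assuming only element counts matter. `InsertSubvectorTypes` and `classifyInsertSubvector` are hypothetical names introduced for illustration and are not part of LLVM; `std::nullopt` stands in for "return the original `Kind` unchanged". It needs C++17.

```cpp
#include <cassert>
#include <optional>

// Hypothetical stand-in for the (Ty, SubTy) pair; only element counts are
// modeled here, not LLVM's VectorType.
struct InsertSubvectorTypes {
  int VecElts; // element count of the vector being inserted into (Ty)
  int SubElts; // element count of the inserted subvector (SubTy)
};

// Mirrors the branches of the patched code, given the values that the
// insert-subvector mask check would report (NumSubElts, Index).
// Returns std::nullopt where the patch returns Kind unchanged.
std::optional<InsertSubvectorTypes>
classifyInsertSubvector(int NumSrcElts, int NumSubElts, int Index,
                        int MaskSize) {
  assert(NumSrcElts > 0 && NumSubElts > 0 && Index >= 0 && MaskSize > 0);
  if (Index + NumSubElts > NumSrcElts) {
    // The subvector does not fit into a vector of the source width.
    if (Index + NumSrcElts > MaskSize)
      return std::nullopt;
    // Patch: SubTy becomes <MaskSize x elt>, then Ty and SubTy are swapped,
    // so the wide vector ends up as Ty and the original Ty as SubTy.
    return InsertSubvectorTypes{MaskSize, NumSrcElts};
  }
  // Pre-existing path: the subvector fits inside the source-width vector.
  return InsertSubvectorTypes{NumSrcElts, NumSubElts};
}
```

With illustrative inputs `NumSrcElts = 2`, `NumSubElts = 1`, `Index = 2`, `MaskSize = 4`, the sketch takes the new branch and reports a 4-element vector with a 2-element subvector, which is why `Ty` has to be passed by reference in the patch.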
334c9cf to 4f37505
4f37505 to 857d4a2
Ping!
857d4a2 to 4c79e4a
Ping!
The SLP vectorizer passes the type of the subvector and the mask, whose size determines the size of the resulting vector. TTI should support this pattern to improve cost estimation of the insert_subvector shuffle pattern.
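To make "the mask size determines the size of the resulting vector" concrete, the following is a small sketch of two-source shuffle semantics outside LLVM. It assumes float lanes and models poison as `std::nullopt`; `Lane` and `applyShuffle` are hypothetical names used only for this illustration.

```cpp
#include <cassert>
#include <optional>
#include <vector>

using Lane = std::optional<float>; // std::nullopt models a poison lane

// Applies a shuffle mask to two equally sized sources. Mask value M < 0
// yields poison, M < N selects Src0[M], otherwise Src1[M - N], where N is
// the source width. The result always has Mask.size() lanes.
std::vector<Lane> applyShuffle(const std::vector<Lane> &Src0,
                               const std::vector<Lane> &Src1,
                               const std::vector<int> &Mask) {
  assert(Src0.size() == Src1.size());
  const int NumSrcElts = static_cast<int>(Src0.size());
  std::vector<Lane> Result;
  Result.reserve(Mask.size());
  for (int M : Mask) {
    assert(M < 2 * NumSrcElts && "mask index out of range");
    if (M < 0)
      Result.push_back(std::nullopt);
    else if (M < NumSrcElts)
      Result.push_back(Src0[M]);
    else
      Result.push_back(Src1[M - NumSrcElts]);
  }
  return Result;
}
```

Applied to two 2-lane sources with the mask `<0, 1, 2, poison>` from the updated test, this yields a 4-lane result: the sources have the subvector width, while the mask alone fixes the width of the vector being built.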
4c79e4a to 54d38be
Moved detection to SLP vectorizer.
LGTM