[SLP][TTI]Improve detection of the insert-subvector pattern for SLP. #74749

Merged: alexey-bataev merged 1 commit into llvm:main from the TTIImproveInsertSubvector branch on Jan 10, 2024

alexey-bataev
Member

The SLP vectorizer passes the type of the subvector together with the mask,
whose size determines the size of the resulting vector. TTI should support
this pattern to improve cost estimation for the insert_subvector shuffle
pattern.
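
As a reading aid, here is a minimal stand-alone sketch of how a two-source shuffle selects lanes, and why the result has as many elements as the mask rather than as the sources. `Lane` and `applyShuffle` are hypothetical names used only for illustration and are not LLVM APIs; a negative mask value models a poison lane.

```cpp
#include <cassert>
#include <vector>

// Hypothetical lane model (not an LLVM type): a float that may be poison.
struct Lane {
  bool Poison;
  float Value;
};

// Hypothetical helper (not an LLVM API) modelling lane selection of a
// two-source shuffle. With N = source length, a mask value M selects:
//   M < 0        -> poison
//   0 <= M < N   -> lane M of A
//   N <= M < 2N  -> lane M - N of B
// The result always has Mask.size() lanes.
std::vector<Lane> applyShuffle(const std::vector<Lane> &A,
                               const std::vector<Lane> &B,
                               const std::vector<int> &Mask) {
  assert(A.size() == B.size() && "both sources must have the same length");
  const int N = static_cast<int>(A.size());
  std::vector<Lane> Result;
  Result.reserve(Mask.size());
  for (int M : Mask) {
    if (M < 0) {
      Result.push_back(Lane{true, 0.0f});
    } else if (M < N) {
      Result.push_back(A[M]);
    } else {
      assert(M < 2 * N && "mask index out of range");
      Result.push_back(B[M - N]);
    }
  }
  return Result;
}
```

Calling `applyShuffle` with two 2-lane sources and the mask `{0, 1, 2, -1}` yields four lanes: both lanes of the first source, lane 0 of the second source, and one poison lane. This has the same shape as the `<2 x float>` to `<4 x float>` shuffle in the updated test.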

@llvmbot
Member

llvmbot commented Dec 7, 2023

@llvm/pr-subscribers-llvm-transforms

Author: Alexey Bataev (alexey-bataev)

Full diff: https://github.com/llvm/llvm-project/pull/74749.diff

2 Files Affected:

  • (modified) llvm/include/llvm/CodeGen/BasicTTIImpl.h (+8-3)
  • (modified) llvm/test/Transforms/SLPVectorizer/RISCV/remarks-insert-into-small-vector.ll (+6-11)
diff --git a/llvm/include/llvm/CodeGen/BasicTTIImpl.h b/llvm/include/llvm/CodeGen/BasicTTIImpl.h
index e05ce2890a08c..94e35e1529397 100644
--- a/llvm/include/llvm/CodeGen/BasicTTIImpl.h
+++ b/llvm/include/llvm/CodeGen/BasicTTIImpl.h
@@ -942,7 +942,7 @@ class BasicTTIImplBase : public TargetTransformInfoImplCRTPBase<T> {
 
   TTI::ShuffleKind improveShuffleKindFromMask(TTI::ShuffleKind Kind,
                                               ArrayRef<int> Mask,
-                                              VectorType *Ty, int &Index,
+                                              VectorType *&Ty, int &Index,
                                               VectorType *&SubTy) const {
     if (Mask.empty())
       return Kind;
@@ -963,8 +963,13 @@ class BasicTTIImplBase : public TargetTransformInfoImplCRTPBase<T> {
       int NumSubElts;
       if (Mask.size() > 2 && ShuffleVectorInst::isInsertSubvectorMask(
                                  Mask, NumSrcElts, NumSubElts, Index)) {
-        if (Index + NumSubElts > NumSrcElts)
-          return Kind;
+        if (Index + NumSubElts > NumSrcElts) {
+          if (Index + NumSrcElts > static_cast<int>(Mask.size()))
+            return Kind;
+          SubTy = FixedVectorType::get(Ty->getElementType(), Mask.size());
+          std::swap(Ty, SubTy);
+          return TTI::SK_InsertSubvector;
+        }
         SubTy = FixedVectorType::get(Ty->getElementType(), NumSubElts);
         return TTI::SK_InsertSubvector;
       }
diff --git a/llvm/test/Transforms/SLPVectorizer/RISCV/remarks-insert-into-small-vector.ll b/llvm/test/Transforms/SLPVectorizer/RISCV/remarks-insert-into-small-vector.ll
index 8e0f382222241..de1eecd98eeb3 100644
--- a/llvm/test/Transforms/SLPVectorizer/RISCV/remarks-insert-into-small-vector.ll
+++ b/llvm/test/Transforms/SLPVectorizer/RISCV/remarks-insert-into-small-vector.ll
@@ -8,7 +8,7 @@
 ; YAML-NEXT:  Function:        test
 ; YAML-NEXT:  Args:
 ; YAML-NEXT:  - String:          'Stores SLP vectorized with cost '
-; YAML-NEXT:  - Cost:            '9'
+; YAML-NEXT:  - Cost:            '3'
 ; YAML-NEXT:  - String:          ' and with tree size '
 ; YAML-NEXT:  - TreeSize:        '7'
 
@@ -19,20 +19,15 @@ define void @test() {
 ; CHECK-NEXT:    [[TMP0:%.*]] = load float, ptr null, align 4
 ; CHECK-NEXT:    [[TMP1:%.*]] = load float, ptr null, align 4
 ; CHECK-NEXT:    [[TMP2:%.*]] = load float, ptr null, align 4
-; CHECK-NEXT:    [[V9IDX:%.*]] = getelementptr i8, ptr null, i32 4
-; CHECK-NEXT:    [[V14IDX:%.*]] = getelementptr i8, ptr null, i32 8
 ; CHECK-NEXT:    [[TMP3:%.*]] = insertelement <2 x float> <float poison, float 0.000000e+00>, float [[TMP1]], i32 0
 ; CHECK-NEXT:    [[TMP4:%.*]] = insertelement <2 x float> poison, float [[TMP0]], i32 0
 ; CHECK-NEXT:    [[TMP5:%.*]] = insertelement <2 x float> [[TMP4]], float [[TMP2]], i32 1
 ; CHECK-NEXT:    [[TMP6:%.*]] = fcmp ogt <2 x float> [[TMP3]], [[TMP5]]
-; CHECK-NEXT:    [[TMP7:%.*]] = extractelement <2 x i1> [[TMP6]], i32 0
-; CHECK-NEXT:    [[V0_0:%.*]] = select i1 [[TMP7]], float [[TMP0]], float 0.000000e+00
-; CHECK-NEXT:    [[TMP8:%.*]] = select <2 x i1> [[TMP6]], <2 x float> [[TMP3]], <2 x float> zeroinitializer
-; CHECK-NEXT:    [[TMP9:%.*]] = extractelement <2 x i1> [[TMP6]], i32 1
-; CHECK-NEXT:    [[V9_0:%.*]] = select i1 [[TMP9]], float [[TMP2]], float 0.000000e+00
-; CHECK-NEXT:    store float [[V0_0]], ptr null, align 4
-; CHECK-NEXT:    store float [[V9_0]], ptr [[V9IDX]], align 4
-; CHECK-NEXT:    store <2 x float> [[TMP8]], ptr [[V14IDX]], align 4
+; CHECK-NEXT:    [[TMP7:%.*]] = shufflevector <2 x i1> [[TMP6]], <2 x i1> poison, <4 x i32> <i32 0, i32 1, i32 0, i32 1>
+; CHECK-NEXT:    [[TMP8:%.*]] = shufflevector <2 x float> [[TMP5]], <2 x float> [[TMP3]], <4 x i32> <i32 0, i32 1, i32 2, i32 poison>
+; CHECK-NEXT:    [[TMP9:%.*]] = shufflevector <4 x float> [[TMP8]], <4 x float> <float poison, float poison, float poison, float 0.000000e+00>, <4 x i32> <i32 0, i32 1, i32 2, i32 7>
+; CHECK-NEXT:    [[TMP10:%.*]] = select <4 x i1> [[TMP7]], <4 x float> [[TMP9]], <4 x float> zeroinitializer
+; CHECK-NEXT:    store <4 x float> [[TMP10]], ptr null, align 4
 ; CHECK-NEXT:    ret void
 ;
 entry:
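
The bounds logic in the BasicTTIImpl.h hunk above can be sketched in isolation. This is a simplified model, not the LLVM code: vector types are reduced to plain element counts, `ShuffleTypes` and `classifyInsert` are hypothetical names, and it assumes that `NumSrcElts` is the element count of the incoming `Ty` (the hunk does not show where it is computed). It models only the diff as posted; a later comment in this thread says the detection was moved to the SLP vectorizer, so the merged code may differ.

```cpp
#include <cassert>
#include <utility>

// Illustrative stand-in for TTI::ShuffleKind; only two kinds are modelled.
enum class ShuffleKind { PermuteTwoSrc, InsertSubvector };

// Hypothetical result record: the kind plus the element counts that Ty and
// SubTy would have after the call (NumSubElts is 0 if SubTy was left unset).
struct ShuffleTypes {
  ShuffleKind Kind;
  int NumVecElts;
  int NumSubElts;
};

// NumSrcElts:       element count of the incoming Ty (assumption, see above).
// MaskSize:         number of mask elements, i.e. the width of the result.
// Index/NumSubElts: what the insert-subvector mask matcher reported.
ShuffleTypes classifyInsert(ShuffleKind Kind, int NumSrcElts, int MaskSize,
                            int Index, int NumSubElts) {
  assert(NumSrcElts > 0 && MaskSize > 0 && Index >= 0 && NumSubElts > 0);
  int TyElts = NumSrcElts;
  int SubTyElts = 0;
  if (Index + NumSubElts > NumSrcElts) {
    // The insertion does not fit into a vector of the incoming type.
    if (Index + NumSrcElts > MaskSize)
      return {Kind, TyElts, SubTyElts}; // give up: keep the original kind
    // Treat the mask-sized vector as the destination and the incoming type
    // as the inserted subvector (this mirrors the std::swap in the hunk).
    SubTyElts = MaskSize;
    std::swap(TyElts, SubTyElts);
    return {ShuffleKind::InsertSubvector, TyElts, SubTyElts};
  }
  // Pre-existing path: the subvector fits inside the incoming type.
  SubTyElts = NumSubElts;
  return {ShuffleKind::InsertSubvector, TyElts, SubTyElts};
}
```

In this model, `NumSrcElts = 2`, `MaskSize = 4`, `Index = 2`, `NumSubElts = 2` is reported as inserting a 2-element subvector into a 4-element vector, whereas before the change the same inputs would have kept the original shuffle kind. Whether the matcher produces exactly these values for a given mask is not something this sketch tries to establish.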

@RKSimon
Collaborator

RKSimon commented Dec 10, 2023

Could we not improve ShuffleVectorInst::isInsertSubvectorMask directly instead?

@alexey-bataev
Member Author

> Could we not improve ShuffleVectorInst::isInsertSubvectorMask directly instead?

Sorry, I did not quite understand. Do you mean making the changes in ShuffleVectorInst::isInsertSubvectorMask? Unfortunately, that is not enough. Currently improveShuffleKindFromMask does not provide the correct type/subtype at all, so even changing ShuffleVectorInst::isInsertSubvectorMask won't help.

@alexey-bataev alexey-bataev force-pushed the TTIImproveInsertSubvector branch from 334c9cf to 4f37505 Compare December 19, 2023 18:22
@alexey-bataev alexey-bataev force-pushed the TTIImproveInsertSubvector branch from 4f37505 to 857d4a2 Compare January 3, 2024 19:13
@alexey-bataev
Member Author

Ping!

@alexey-bataev alexey-bataev force-pushed the TTIImproveInsertSubvector branch from 857d4a2 to 4c79e4a Compare January 5, 2024 01:23
@alexey-bataev
Member Author

Ping!

@alexey-bataev
Member Author

Moved the detection to the SLP vectorizer.

Collaborator

@RKSimon RKSimon left a comment

LGTM

@alexey-bataev alexey-bataev merged commit 73ce13d into llvm:main Jan 10, 2024
@alexey-bataev alexey-bataev deleted the TTIImproveInsertSubvector branch January 10, 2024 15:39
justinfargnoli pushed a commit to justinfargnoli/llvm-project that referenced this pull request Jan 28, 2024
…lvm#74749)
