[AMDGPU] Add sext_trunc in RegBankCombiner #131623
@llvm/pr-subscribers-llvm-globalisel @llvm/pr-subscribers-backend-amdgpu
Author: Pierre van Houtryve (Pierre-vh)
Changes
Full diff: https://github.com/llvm/llvm-project/pull/131623.diff
1 Files Affected:
diff --git a/llvm/lib/Target/AMDGPU/AMDGPUCombine.td b/llvm/lib/Target/AMDGPU/AMDGPUCombine.td
index a21505356274b..083ce48911689 100644
--- a/llvm/lib/Target/AMDGPU/AMDGPUCombine.td
+++ b/llvm/lib/Target/AMDGPU/AMDGPUCombine.td
@@ -181,5 +181,5 @@ def AMDGPURegBankCombiner : GICombiner<
zext_trunc_fold, int_minmax_to_med3, ptr_add_immed_chain,
fp_minmax_to_clamp, fp_minmax_to_med3, fmed3_intrinsic_to_clamp,
identity_combines, redundant_and, constant_fold_cast_op,
- cast_of_cast_combines]> {
+ cast_of_cast_combines, sext_trunc]> {
}
Missing test changes?
3f2cbbd to 4feac2f
Test changes were in the previous diff in the stack; it should be fixed now.
Hmm, I still can't see the test change?
Ah, this doesn't do anything at this stage. It's only helpful once we disable widening of i16 ops to i32 in CGP. Then this pattern can appear and it'll fold it. This combine is tested in AArch64. Should I copy over a few simple test cases in the AMDGPU folder just to show the combine works in RegBankCombiner?
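For readers unfamiliar with the pattern, here is a minimal arithmetic sketch of why a sign-extend of a truncate can be folded away. The thread does not state the exact matching condition of the `sext_trunc` rule; the sketch assumes the fold is only valid when the truncation is known not to change the signed value (for example via a no-signed-wrap flag on the truncate). The helper names (`trunc`, `sext`, `trunc_is_nsw`) are illustrative and are not LLVM APIs.

```python
def mask(bits: int) -> int:
    """All-ones mask for a `bits`-wide integer."""
    return (1 << bits) - 1


def to_signed(value: int, bits: int) -> int:
    """Interpret the low `bits` bits of `value` as a two's-complement number."""
    value &= mask(bits)
    return value - (1 << bits) if value >> (bits - 1) else value


def trunc(value: int, bits: int) -> int:
    """Keep only the low `bits` bits (models a truncate)."""
    return value & mask(bits)


def sext(value: int, from_bits: int, to_bits: int) -> int:
    """Sign-extend a `from_bits`-wide value to `to_bits` bits."""
    return to_signed(value, from_bits) & mask(to_bits)


def trunc_is_nsw(value: int, wide_bits: int, narrow_bits: int) -> bool:
    """Assumed precondition: the signed value survives the truncation."""
    signed = to_signed(value, wide_bits)
    return -(1 << (narrow_bits - 1)) <= signed < (1 << (narrow_bits - 1))


# -2 as an i32 fits in i16, so sext(trunc(x)) gives x back and can be folded.
x = 0xFFFFFFFE
assert trunc_is_nsw(x, 32, 16)
assert sext(trunc(x, 16), 16, 32) == x

# 0x00018000 does not fit in i16, so the round trip changes the value
# and the fold would be wrong without the precondition.
y = 0x00018000
assert not trunc_is_nsw(y, 32, 16)
assert sext(trunc(y, 16), 16, 32) != y
```

The i32/i16 widths mirror the i16-to-i32 widening mentioned above; any pair of widths behaves the same way.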
f95a7c0 to e717745
@@ -1,5 +1,6 @@
 # NOTE: Assertions have been autogenerated by utils/update_mir_test_checks.py
-# RUN: llc -mtriple=amdgcn-mesa-mesa3d -mcpu=tahiti -run-pass=amdgpu-prelegalizer-combiner -verify-machineinstrs %s -o - | FileCheck -check-prefix=GCN %s
+# RUN: llc -mtriple=amdgcn-mesa-mesa3d -mcpu=tahiti -run-pass=amdgpu-prelegalizer-combiner -verify-machineinstrs %s -o - | FileCheck -check-prefixes=GCN,PRELEGAL %s
+# RUN: llc -mtriple=amdgcn-mesa-mesa3d -mcpu=tahiti -run-pass=amdgpu-regbank-combiner -verify-machineinstrs %s -o - | FileCheck -check-prefixes=GCN,RBCOMB %s
@arsenm Is it a good idea here to test two passes in one test case?
Yes. It helps to capture the lowering at significant points during the codegen that are important to test in certain scenarios.
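As a rough illustration of how one test file can serve both RUN lines, the sketch below models prefix selection: each FileCheck invocation only honours check lines whose prefix it was given, so `GCN` lines apply to both passes while `PRELEGAL` and `RBCOMB` lines apply to one pass each. This is a simplified model, not FileCheck's actual implementation; `active_checks` and the sample check lines are hypothetical.

```python
import re


def active_checks(test_lines, prefixes):
    """Return the check lines a run with these prefixes would use.

    Simplified model: a line is active if it contains one of the prefixes,
    optionally followed by a suffix such as -LABEL or -NEXT, then a colon.
    """
    alternatives = "|".join(re.escape(p) for p in prefixes)
    pattern = re.compile(r"#\s*(" + alternatives + r")(-[A-Z]+)?:")
    return [line for line in test_lines if pattern.search(line)]


# Hypothetical check lines, not taken from the actual test file.
lines = [
    "# GCN-LABEL: output expected from both passes",
    "# PRELEGAL: output expected only from the prelegalizer combiner",
    "# RBCOMB: output expected only from the regbank combiner",
]

prelegal_run = active_checks(lines, ["GCN", "PRELEGAL"])
regbank_run = active_checks(lines, ["GCN", "RBCOMB"])

assert prelegal_run == lines[:2]
assert regbank_run == [lines[0], lines[2]]
```

Shared expectations go under the common prefix, and only the output that differs between the two passes needs a pass-specific prefix.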
dd4dbe6 to ca9b7c3
ca9b7c3 to 2f2f6b7