[RISCV] Update MicroOpBufferSize in P400 & P600 scheduling models #128786
@llvm/pr-subscribers-backend-risc-v
Author: Min-Yih Hsu (mshockwave)
Changes
The numbers we previously picked for MicroOpBufferSize in the P400 and P600 scheduling models turned out to be too conservative and did not properly reflect the characteristics of our microarchitectures. This patch updates them to be more faithful to our hardware.
This is unlikely to have a significant impact on MachineScheduler, which uses MicroOpBufferSize in only a few places. It should, however, improve the accuracy of llvm-mca.
Full diff: https://github.com/llvm/llvm-project/pull/128786.diff
5 Files Affected:
diff --git a/llvm/lib/Target/RISCV/RISCVSchedSiFiveP400.td b/llvm/lib/Target/RISCV/RISCVSchedSiFiveP400.td
index 396cbe2c476c6..e7f8f88e3909f 100644
--- a/llvm/lib/Target/RISCV/RISCVSchedSiFiveP400.td
+++ b/llvm/lib/Target/RISCV/RISCVSchedSiFiveP400.td
@@ -119,7 +119,7 @@ class SiFiveP400VSM3CCycles<string mx> {
def SiFiveP400Model : SchedMachineModel {
let IssueWidth = 3; // 3 micro-ops are dispatched per cycle.
- let MicroOpBufferSize = 56; // Max micro-ops that can be buffered.
+ let MicroOpBufferSize = 96; // Max micro-ops that can be buffered.
let LoadLatency = 4; // Cycles for loads to access the cache.
let MispredictPenalty = 9; // Extra cycles for a mispredicted branch.
let UnsupportedFeatures = [HasStdExtZbkb, HasStdExtZbkc, HasStdExtZbkx,
diff --git a/llvm/lib/Target/RISCV/RISCVSchedSiFiveP600.td b/llvm/lib/Target/RISCV/RISCVSchedSiFiveP600.td
index 0c695c9ef3071..60d41b02f0e8a 100644
--- a/llvm/lib/Target/RISCV/RISCVSchedSiFiveP600.td
+++ b/llvm/lib/Target/RISCV/RISCVSchedSiFiveP600.td
@@ -286,10 +286,10 @@ class SiFiveP600VSHA2MSCycles<string mx, int sew> {
// SiFiveP600 machine model for scheduling and other instruction cost heuristics.
def SiFiveP600Model : SchedMachineModel {
- let IssueWidth = 4; // 4 micro-ops are dispatched per cycle.
- let MicroOpBufferSize = 160; // Max micro-ops that can be buffered.
- let LoadLatency = 4; // Cycles for loads to access the cache.
- let MispredictPenalty = 9; // Extra cycles for a mispredicted branch.
+ let IssueWidth = 4; // 4 micro-ops are dispatched per cycle.
+ let MicroOpBufferSize = 192; // Max micro-ops that can be buffered.
+ let LoadLatency = 4; // Cycles for loads to access the cache.
+ let MispredictPenalty = 9; // Extra cycles for a mispredicted branch.
let UnsupportedFeatures = [HasStdExtZbkb, HasStdExtZbkc, HasStdExtZbkx,
HasStdExtZknd, HasStdExtZkne, HasStdExtZknh,
HasStdExtZksed, HasStdExtZksh, HasStdExtZkr,
diff --git a/llvm/test/tools/llvm-mca/RISCV/SiFiveP400/div.s b/llvm/test/tools/llvm-mca/RISCV/SiFiveP400/div.s
index c42b4a9ef4ac4..311310bc95982 100644
--- a/llvm/test/tools/llvm-mca/RISCV/SiFiveP400/div.s
+++ b/llvm/test/tools/llvm-mca/RISCV/SiFiveP400/div.s
@@ -328,12 +328,12 @@ vfsqrt.v v8, v16
# CHECK: Iterations: 1
# CHECK-NEXT: Instructions: 320
-# CHECK-NEXT: Total Cycles: 22358
+# CHECK-NEXT: Total Cycles: 19388
# CHECK-NEXT: Total uOps: 320
# CHECK: Dispatch Width: 3
-# CHECK-NEXT: uOps Per Cycle: 0.01
-# CHECK-NEXT: IPC: 0.01
+# CHECK-NEXT: uOps Per Cycle: 0.02
+# CHECK-NEXT: IPC: 0.02
# CHECK-NEXT: Block RThroughput: 14361.0
# CHECK: Instruction Info:
diff --git a/llvm/test/tools/llvm-mca/RISCV/SiFiveP400/vlseg-vsseg.s b/llvm/test/tools/llvm-mca/RISCV/SiFiveP400/vlseg-vsseg.s
index 9ba461acef0e3..a3f14f316e874 100644
--- a/llvm/test/tools/llvm-mca/RISCV/SiFiveP400/vlseg-vsseg.s
+++ b/llvm/test/tools/llvm-mca/RISCV/SiFiveP400/vlseg-vsseg.s
@@ -1606,7 +1606,7 @@ vsoxseg8ei64.v v8, (a0), v16
# CHECK: Iterations: 1
# CHECK-NEXT: Instructions: 1540
-# CHECK-NEXT: Total Cycles: 29967
+# CHECK-NEXT: Total Cycles: 28335
# CHECK-NEXT: Total uOps: 1540
# CHECK: Dispatch Width: 3
diff --git a/llvm/test/tools/llvm-mca/RISCV/SiFiveP600/div.s b/llvm/test/tools/llvm-mca/RISCV/SiFiveP600/div.s
index 0b7cc95dcb8d6..0d14a0f734bdc 100644
--- a/llvm/test/tools/llvm-mca/RISCV/SiFiveP600/div.s
+++ b/llvm/test/tools/llvm-mca/RISCV/SiFiveP600/div.s
@@ -328,7 +328,7 @@ vfsqrt.v v8, v16
# CHECK: Iterations: 1
# CHECK-NEXT: Instructions: 320
-# CHECK-NEXT: Total Cycles: 14613
+# CHECK-NEXT: Total Cycles: 14397
# CHECK-NEXT: Total uOps: 320
# CHECK: Dispatch Width: 4
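As a sanity check on the updated CHECK lines, the short Python sketch below recomputes the IPC values and the relative cycle reductions from the numbers in the test diffs. It assumes llvm-mca reports IPC as instructions divided by total cycles, rounded to two decimals; the helper names `ipc` and `cycle_reduction_pct` are illustrative and are not part of LLVM.

```python
# Figures copied from the CHECK lines in the test diffs:
# (instructions, total cycles before, total cycles after).
RUNS = {
    "SiFiveP400/div.s": (320, 22358, 19388),
    "SiFiveP400/vlseg-vsseg.s": (1540, 29967, 28335),
    "SiFiveP600/div.s": (320, 14613, 14397),
}


def ipc(instructions, cycles):
    """Instructions per cycle (assumed definition: instructions / total cycles)."""
    return instructions / cycles


def cycle_reduction_pct(before, after):
    """Relative reduction in total cycles, as a percentage of the old value."""
    return 100.0 * (before - after) / before


if __name__ == "__main__":
    for name, (insts, before, after) in RUNS.items():
        print(
            f"{name}: IPC {ipc(insts, before):.2f} -> {ipc(insts, after):.2f}, "
            f"total cycles down {cycle_reduction_pct(before, after):.1f}%"
        )
```

For the P400 `div.s` test this gives an IPC moving from 0.01 to 0.02, matching the changed CHECK lines. Every instruction in these tests counts as one uOp (Total uOps equals Instructions), so uOps Per Cycle follows the same arithmetic.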
LGTM.