Skip to content

[ScheduleDAG] Allow disabling the SchedModel / Itineraries during Scheduling #138057

New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

Merged
merged 4 commits into from
May 5, 2025
Merged
Show file tree
Hide file tree
Changes from 3 commits
Commits
File filter

Filter by extension

Filter by extension

Conversations
Failed to load comments.
Loading
Jump to
Jump to file
Failed to load files.
Loading
Diff view
Diff view
6 changes: 5 additions & 1 deletion llvm/include/llvm/CodeGen/TargetSchedule.h
Original file line number Diff line number Diff line change
Expand Up @@ -45,6 +45,9 @@ class TargetSchedModel {

unsigned computeInstrLatency(const MCSchedClassDesc &SCDesc) const;

bool EnableSchedModel = true;
bool EnableSchedItins = true;
Comment on lines +55 to +56
Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Document these, maybe should just make it an enum for which type to use


public:
TargetSchedModel() : SchedModel(MCSchedModel::Default) {}

Expand All @@ -53,7 +56,8 @@ class TargetSchedModel {
/// The machine model API keeps a copy of the top-level MCSchedModel table
/// indices and may query TargetSubtargetInfo and TargetInstrInfo to resolve
/// dynamic properties.
void init(const TargetSubtargetInfo *TSInfo);
void init(const TargetSubtargetInfo *TSInfo, bool EnableSModel = true,
bool EnableSItins = true);

/// Return the MCSchedClassDesc for this instruction.
const MCSchedClassDesc *resolveSchedClass(const MachineInstr *MI) const;
Expand Down
10 changes: 9 additions & 1 deletion llvm/lib/CodeGen/ScheduleDAGInstrs.cpp
Original file line number Diff line number Diff line change
Expand Up @@ -69,6 +69,14 @@ static cl::opt<bool>
static cl::opt<bool> UseTBAA("use-tbaa-in-sched-mi", cl::Hidden,
cl::init(true), cl::desc("Enable use of TBAA during MI DAG construction"));

static cl::opt<bool>
EnableSchedModel("schedmodel", cl::Hidden, cl::init(true),
cl::desc("Use TargetSchedModel for latency lookup"));

static cl::opt<bool>
EnableSchedItins("scheditins", cl::Hidden, cl::init(true),
cl::desc("Use InstrItineraryData for latency lookup"));
Comment on lines +72 to +78
Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

These are mutually exclusive though? What happens if you set both?

Copy link
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

As far as I can tell -- I don't see this mutual exclusion constraint encoded. It seems like the API handles the case of having neither --

Default if both are missing --

if (!hasInstrSchedModel() && !hasInstrItineraries())

Default if both are missing --

return TII->defaultDefLatency(SchedModel, *MI);


// Note: the two options below might be used in tuning compile time vs
// output quality. Setting HugeRegion so large that it will never be
// reached means best-effort, but may be slow.
Expand Down Expand Up @@ -121,7 +129,7 @@ ScheduleDAGInstrs::ScheduleDAGInstrs(MachineFunction &mf,
DbgValues.clear();

const TargetSubtargetInfo &ST = mf.getSubtarget();
SchedModel.init(&ST);
SchedModel.init(&ST, EnableSchedModel, EnableSchedItins);
}

/// If this machine instr has memory reference information and it can be
Expand Down
12 changes: 5 additions & 7 deletions llvm/lib/CodeGen/TargetSchedule.cpp
Original file line number Diff line number Diff line change
Expand Up @@ -29,12 +29,6 @@

using namespace llvm;

static cl::opt<bool> EnableSchedModel("schedmodel", cl::Hidden, cl::init(true),
cl::desc("Use TargetSchedModel for latency lookup"));

static cl::opt<bool> EnableSchedItins("scheditins", cl::Hidden, cl::init(true),
cl::desc("Use InstrItineraryData for latency lookup"));

static cl::opt<bool> ForceEnableIntervals(
"sched-model-force-enable-intervals", cl::Hidden, cl::init(false),
cl::desc("Force the use of resource intervals in the schedule model"));
Expand All @@ -47,12 +41,16 @@ bool TargetSchedModel::hasInstrItineraries() const {
return EnableSchedItins && !InstrItins.isEmpty();
}

void TargetSchedModel::init(const TargetSubtargetInfo *TSInfo) {
void TargetSchedModel::init(const TargetSubtargetInfo *TSInfo,
bool EnableSModel, bool EnableSItins) {
STI = TSInfo;
SchedModel = TSInfo->getSchedModel();
TII = TSInfo->getInstrInfo();
STI->initInstrItins(InstrItins);

EnableSchedModel = EnableSModel;
EnableSchedItins = EnableSItins;

unsigned NumRes = SchedModel.getNumProcResourceKinds();
ResourceFactors.resize(NumRes);
ResourceLCM = SchedModel.IssueWidth;
Expand Down
1 change: 1 addition & 0 deletions llvm/test/CodeGen/AMDGPU/mai-hazards-gfx942.mir
Original file line number Diff line number Diff line change
@@ -1,5 +1,6 @@
# RUN: llc -mtriple=amdgcn -mcpu=gfx942 -verify-machineinstrs -run-pass post-RA-hazard-rec %s -o - | FileCheck -check-prefixes=GCN,GFX942 %s
# RUN: llc -mtriple=amdgcn -mcpu=gfx950 -verify-machineinstrs -run-pass post-RA-hazard-rec %s -o - | FileCheck -check-prefixes=GCN,GFX950 %s
# RUN: llc -mtriple=amdgcn -mcpu=gfx950 -verify-machineinstrs -run-pass post-RA-hazard-rec --schedmodel=0 %s -o - | FileCheck -check-prefixes=GCN,GFX950 %s

# GCN-LABEL: name: valu_write_vgpr_sgemm_mfma_read
# GCN: V_MOV_B32
Expand Down
50 changes: 50 additions & 0 deletions llvm/test/CodeGen/AMDGPU/sched-no-schedmodel.mir
Original file line number Diff line number Diff line change
@@ -0,0 +1,50 @@
# NOTE: Assertions have been autogenerated by utils/update_mir_test_checks.py UTC_ARGS: --version 5
# RUN: llc -mtriple=amdgcn-amd-amdhsa -mcpu=gfx942 -misched-cluster=false --misched-prera-direction=topdown -run-pass=machine-scheduler --schedmodel=1 -o - %s | FileCheck -check-prefix=GCN %s
# RUN: llc -mtriple=amdgcn-amd-amdhsa -mcpu=gfx942 -misched-cluster=false --misched-prera-direction=topdown -run-pass=machine-scheduler --schedmodel=0 -o - %s | FileCheck -check-prefix=GCN-NO-SCHEDMODEL %s

---
name: sched_group_barrier_1_VMEM_READ_1_VALU_5_MFMA_1_VMEM_READ_3_VALU_2_VMEM_WRITE
tracksRegLiveness: true
body: |
bb.0:

; GCN-LABEL: name: sched_group_barrier_1_VMEM_READ_1_VALU_5_MFMA_1_VMEM_READ_3_VALU_2_VMEM_WRITE
; GCN: [[DEF:%[0-9]+]]:vreg_128_align2 = IMPLICIT_DEF
; GCN-NEXT: [[DEF1:%[0-9]+]]:vreg_128_align2 = IMPLICIT_DEF
; GCN-NEXT: early-clobber %2:vreg_512_align2 = contract V_MFMA_F32_32X32X16_FP8_FP8_vgprcd_e64 [[DEF]].sub0_sub1, [[DEF1]].sub0_sub1, 0, 0, 0, 0, implicit $mode, implicit $exec
; GCN-NEXT: [[DEF2:%[0-9]+]]:vgpr_32 = IMPLICIT_DEF
; GCN-NEXT: dead [[DS_READ_U16_gfx9_:%[0-9]+]]:vgpr_32 = DS_READ_U16_gfx9 [[DEF2]], 0, 0, implicit $exec
; GCN-NEXT: [[DEF3:%[0-9]+]]:vgpr_32 = IMPLICIT_DEF
; GCN-NEXT: dead [[DS_READ_U16_gfx9_1:%[0-9]+]]:vgpr_32 = DS_READ_U16_gfx9 [[DEF3]], 0, 0, implicit $exec
; GCN-NEXT: [[DEF4:%[0-9]+]]:vgpr_32 = IMPLICIT_DEF
; GCN-NEXT: dead [[DS_READ_U16_gfx9_2:%[0-9]+]]:vgpr_32 = DS_READ_U16_gfx9 [[DEF4]], 0, 0, implicit $exec
; GCN-NEXT: [[V_MUL_LO_U32_e64_:%[0-9]+]]:vgpr_32 = nsw V_MUL_LO_U32_e64 %2.sub0, %2.sub1, implicit $exec
; GCN-NEXT: early-clobber %3:vreg_512_align2 = contract V_MFMA_F32_32X32X16_FP8_FP8_vgprcd_e64 [[DEF]].sub0_sub1, [[DEF1]].sub0_sub1, 0, 0, 0, 0, implicit $mode, implicit $exec
; GCN-NEXT: S_ENDPGM 0, implicit %2, implicit %3, implicit [[V_MUL_LO_U32_e64_]]
;
; GCN-NO-SCHEDMODEL-LABEL: name: sched_group_barrier_1_VMEM_READ_1_VALU_5_MFMA_1_VMEM_READ_3_VALU_2_VMEM_WRITE
; GCN-NO-SCHEDMODEL: [[DEF:%[0-9]+]]:vreg_128_align2 = IMPLICIT_DEF
; GCN-NO-SCHEDMODEL-NEXT: [[DEF1:%[0-9]+]]:vreg_128_align2 = IMPLICIT_DEF
; GCN-NO-SCHEDMODEL-NEXT: early-clobber %2:vreg_512_align2 = contract V_MFMA_F32_32X32X16_FP8_FP8_vgprcd_e64 [[DEF]].sub0_sub1, [[DEF1]].sub0_sub1, 0, 0, 0, 0, implicit $mode, implicit $exec
; GCN-NO-SCHEDMODEL-NEXT: early-clobber %3:vreg_512_align2 = contract V_MFMA_F32_32X32X16_FP8_FP8_vgprcd_e64 [[DEF]].sub0_sub1, [[DEF1]].sub0_sub1, 0, 0, 0, 0, implicit $mode, implicit $exec
; GCN-NO-SCHEDMODEL-NEXT: [[V_MUL_LO_U32_e64_:%[0-9]+]]:vgpr_32 = nsw V_MUL_LO_U32_e64 %2.sub0, %2.sub1, implicit $exec
; GCN-NO-SCHEDMODEL-NEXT: [[DEF2:%[0-9]+]]:vgpr_32 = IMPLICIT_DEF
; GCN-NO-SCHEDMODEL-NEXT: dead [[DS_READ_U16_gfx9_:%[0-9]+]]:vgpr_32 = DS_READ_U16_gfx9 [[DEF2]], 0, 0, implicit $exec
; GCN-NO-SCHEDMODEL-NEXT: [[DEF3:%[0-9]+]]:vgpr_32 = IMPLICIT_DEF
; GCN-NO-SCHEDMODEL-NEXT: dead [[DS_READ_U16_gfx9_1:%[0-9]+]]:vgpr_32 = DS_READ_U16_gfx9 [[DEF3]], 0, 0, implicit $exec
; GCN-NO-SCHEDMODEL-NEXT: [[DEF4:%[0-9]+]]:vgpr_32 = IMPLICIT_DEF
; GCN-NO-SCHEDMODEL-NEXT: dead [[DS_READ_U16_gfx9_2:%[0-9]+]]:vgpr_32 = DS_READ_U16_gfx9 [[DEF4]], 0, 0, implicit $exec
; GCN-NO-SCHEDMODEL-NEXT: S_ENDPGM 0, implicit %2, implicit %3, implicit [[V_MUL_LO_U32_e64_]]
%0:vreg_128_align2 = IMPLICIT_DEF
%1:vreg_128_align2 = IMPLICIT_DEF
%2:vreg_512_align2 = contract V_MFMA_F32_32X32X16_FP8_FP8_vgprcd_e64 %0.sub0_sub1:vreg_128_align2, %1.sub0_sub1:vreg_128_align2, 0, 0, 0, 0, implicit $mode, implicit $exec
%3:vreg_512_align2 = contract V_MFMA_F32_32X32X16_FP8_FP8_vgprcd_e64 %0.sub0_sub1:vreg_128_align2, %1.sub0_sub1:vreg_128_align2, 0, 0, 0, 0, implicit $mode, implicit $exec
%4:vgpr_32 = nsw V_MUL_LO_U32_e64 %2.sub0, %2.sub1, implicit $exec
%5:vgpr_32 = IMPLICIT_DEF
%6:vgpr_32 = DS_READ_U16_gfx9 %5, 0, 0, implicit $exec
%7:vgpr_32 = IMPLICIT_DEF
%8:vgpr_32 = DS_READ_U16_gfx9 %7, 0, 0, implicit $exec
%9:vgpr_32 = IMPLICIT_DEF
%10:vgpr_32 = DS_READ_U16_gfx9 %9, 0, 0, implicit $exec
S_ENDPGM 0, implicit %2, implicit %3, implicit %4
...