Commit 87d7711

[AMDGPU][SIMemoryLegalizer] Fix order of GL0/1_INV on GFX10/11 (llvm#81450)
Fixes SWDEV-443292
1 parent a1efe56 commit 87d7711

17 files changed: +1574 -1571 lines changed

llvm/docs/AMDGPUUsage.rst

Lines changed: 16 additions & 16 deletions
@@ -12291,8 +12291,8 @@ table :ref:`amdgpu-amdhsa-memory-model-code-sequences-gfx10-gfx11-table`.
 before invalidating
 the caches.

-3. buffer_gl0_inv;
-   buffer_gl1_inv
+3. buffer_gl1_inv;
+   buffer_gl0_inv

 - Must happen before
   any following
@@ -12321,8 +12321,8 @@ table :ref:`amdgpu-amdhsa-memory-model-code-sequences-gfx10-gfx11-table`.
 before invalidating
 the caches.

-3. buffer_gl0_inv;
-   buffer_gl1_inv
+3. buffer_gl1_inv;
+   buffer_gl0_inv

 - Must happen before
   any following
@@ -12428,8 +12428,8 @@ table :ref:`amdgpu-amdhsa-memory-model-code-sequences-gfx10-gfx11-table`.
 invalidating the
 caches.

-3. buffer_gl0_inv;
-   buffer_gl1_inv
+3. buffer_gl1_inv;
+   buffer_gl0_inv

 - Must happen before
   any following
@@ -12459,8 +12459,8 @@ table :ref:`amdgpu-amdhsa-memory-model-code-sequences-gfx10-gfx11-table`.
 invalidating the
 caches.

-3. buffer_gl0_inv;
-   buffer_gl1_inv
+3. buffer_gl1_inv;
+   buffer_gl0_inv

 - Must happen before
   any following
@@ -12655,8 +12655,8 @@ table :ref:`amdgpu-amdhsa-memory-model-code-sequences-gfx10-gfx11-table`.
 the
 fence-paired-atomic.

-2. buffer_gl0_inv;
-   buffer_gl1_inv
+2. buffer_gl1_inv;
+   buffer_gl0_inv

 - Must happen before any
   following global/generic
@@ -13369,8 +13369,8 @@ table :ref:`amdgpu-amdhsa-memory-model-code-sequences-gfx10-gfx11-table`.
 invalidating the
 caches.

-4. buffer_gl0_inv;
-   buffer_gl1_inv
+4. buffer_gl1_inv;
+   buffer_gl0_inv

 - Must happen before
   any following
@@ -13444,8 +13444,8 @@ table :ref:`amdgpu-amdhsa-memory-model-code-sequences-gfx10-gfx11-table`.
 invalidating the
 caches.

-4. buffer_gl0_inv;
-   buffer_gl1_inv
+4. buffer_gl1_inv;
+   buffer_gl0_inv

 - Must happen before
   any following
@@ -13672,8 +13672,8 @@ table :ref:`amdgpu-amdhsa-memory-model-code-sequences-gfx10-gfx11-table`.
 requirements of
 release.

-2. buffer_gl0_inv;
-   buffer_gl1_inv
+2. buffer_gl1_inv;
+   buffer_gl0_inv

 - Must happen before
   any following

llvm/lib/Target/AMDGPU/SIMemoryLegalizer.cpp

Lines changed: 4 additions & 1 deletion
@@ -2030,8 +2030,11 @@ bool SIGfx10CacheControl::insertAcquire(MachineBasicBlock::iterator &MI,
     switch (Scope) {
     case SIAtomicScope::SYSTEM:
     case SIAtomicScope::AGENT:
-      BuildMI(MBB, MI, DL, TII->get(AMDGPU::BUFFER_GL0_INV));
+      // The order of invalidates matter here. We must invalidate "outer in"
+      // so L1 -> L0 to avoid L0 pulling in stale data from L1 when it is
+      // invalidated.
       BuildMI(MBB, MI, DL, TII->get(AMDGPU::BUFFER_GL1_INV));
+      BuildMI(MBB, MI, DL, TII->get(AMDGPU::BUFFER_GL0_INV));
       Changed = true;
       break;
     case SIAtomicScope::WORKGROUP:
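The "outer in" reasoning in the new comment can be illustrated with a toy model. The sketch below is not the GFX10/11 hardware and not part of this commit: `ToyCaches`, `Order`, `value_seen_after_acquire`, and `check_orderings` are hypothetical names, and the model assumes a single cache line, an L0 that refills from L1, an L1 that refills from memory, and one load from another wave landing between the two invalidates.

```cpp
#include <cassert>
#include <optional>

// Toy two-level cache holding one line (hypothetical model, not real hardware).
struct ToyCaches {
    int memory;               // up-to-date value in memory
    std::optional<int> l1;    // outer cache, refills from memory
    std::optional<int> l0;    // inner cache, refills from L1

    // Load through L0 -> L1 -> memory, filling each level on a miss.
    int load() {
        if (!l0) {
            if (!l1) l1 = memory;
            l0 = l1;
        }
        return *l0;
    }
    void invalidate_l0() { l0.reset(); }
    void invalidate_l1() { l1.reset(); }
};

enum class Order { L0ThenL1, L1ThenL0 };

// Both caches start with the stale value 1 while memory already holds 2.
// Returns what the first load after the invalidate pair observes.
int value_seen_after_acquire(Order order) {
    ToyCaches c{/*memory=*/2, /*l1=*/1, /*l0=*/1};
    if (order == Order::L0ThenL1) {
        c.invalidate_l0();
        c.load();             // intervening load refills L0 from the stale L1
        c.invalidate_l1();
    } else {
        c.invalidate_l1();
        c.load();             // intervening load may still hit the stale L0
        c.invalidate_l0();    // ...but that stale line is dropped here
    }
    return c.load();
}

// Self-check of the two orderings under the assumptions above.
void check_orderings() {
    assert(value_seen_after_acquire(Order::L0ThenL1) == 1);  // stale
    assert(value_seen_after_acquire(Order::L1ThenL0) == 2);  // fresh
}
```

In this model, invalidating L0 first leaves a window in which L0 can be refilled from a not-yet-invalidated L1, so the stale value survives both invalidates. Invalidating L1 first closes that window, which matches the ordering the commit adopts.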
