[SPIR-V] Allow intrinsics with aggregate return type to reach GlobalISel #108893

@VyacheslavLevytskyy VyacheslavLevytskyy commented Sep 16, 2024

Two main goals of this PR are:

  • to support "Arithmetic with Overflow" intrinsics, including the special case when those intrinsics are being generated by the CodeGenPrepare pass during translations with optimization;
  • to redirect intrinsics with aggregate return type to be lowered via GlobalISel operations instead of SPIRV-specific unfolding/lowering (see [SPIR-V] Lower llvm.x.with.overflow intrinsics #95012).
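
For readers less familiar with this intrinsic family, here is a minimal C++ sketch of what `llvm.uadd.with.overflow.i32` computes: the wrapped sum together with an overflow flag, which is why its return type is the aggregate `{ i32, i1 }`. The names `UAddResult` and `uaddWithOverflow` are hypothetical, chosen only for illustration; they are not part of LLVM or of this PR.

```cpp
#include <cstdint>

// Hypothetical model of the { i32, i1 } aggregate returned by
// llvm.uadd.with.overflow.i32. This is not an LLVM type.
struct UAddResult {
  std::uint32_t math; // element 0: the sum, wrapped modulo 2^32
  bool ov;            // element 1: true if the unsigned addition overflowed
};

// Hypothetical helper modelling the intrinsic's result for 32-bit operands.
constexpr UAddResult uaddWithOverflow(std::uint32_t a, std::uint32_t b) {
  const std::uint32_t sum = a + b; // unsigned arithmetic wraps on overflow
  return UAddResult{sum, sum < a}; // a wrapped sum is smaller than the operand
}
```

In the IR, the two fields correspond to the `extractvalue ..., 0` and `extractvalue ..., 1` results taken from the call.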

There is a new test case, llvm/test/CodeGen/SPIRV/passes/translate-aggregate-uaddo.ll, that describes and checks the general logic of the translation.

This PR continues a series of PRs aimed at identifying and fixing flaws in code emission and at improving pass rates when expensive checks are enabled (see #101732, #104104, #106966), with the ultimate goal of moving the SPIR-V Backend towards non-experimental status.

The reproducers are:

  1. consider llc -O3 -mtriple=spirv64-unknown-unknown ... with:
define spir_func i32 @foo(i32 %a, ptr addrspace(4) %p) {
entry:
  br label %l1

l1:
  %e = phi i32 [ %a, %entry ], [ %i, %body ]
  %i = add nsw i32 %e, 1
  %fl = icmp eq i32 %i, 0
  br i1 %fl, label %exit, label %body

body:
  store i8 42, ptr addrspace(4) %p
  br label %l1

exit:
  ret i32 %i
}
  2. consider llc -O0 -mtriple=spirv64-unknown-unknown ... with:
define spir_func i32 @foo(i32 %a, ptr addrspace(4) %p) {
entry:
  br label %l1

l1:                                               ; preds = %body, %entry
  %e = phi i32 [ %a, %entry ], [ %math, %body ]
  %0 = call { i32, i1 } @llvm.uadd.with.overflow.i32(i32 %e, i32 1)
  %math = extractvalue { i32, i1 } %0, 0
  %ov = extractvalue { i32, i1 } %0, 1
  br i1 %ov, label %exit, label %body

body:                                             ; preds = %l1
  store i8 42, ptr addrspace(4) %p, align 1
  br label %l1

exit:                                             ; preds = %l1
  ret i32 %math
}
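
The two reproducers use related loop exit conditions. In the first, the loop exits when `%i = %e + 1` equals zero, which for a 32-bit add happens exactly when the unsigned addition wraps. In the second, the same decision is taken from the overflow flag of `llvm.uadd.with.overflow.i32`. The sketch below checks that equivalence in plain C++. The function names are hypothetical, and the model deliberately ignores the `nsw` flag and the store in the loop body.

```cpp
#include <cstdint>

// Exit test in the style of the first reproducer:
//   %i = add i32 %e, 1 ; %fl = icmp eq i32 %i, 0
constexpr bool exitsViaCompare(std::uint32_t e) {
  const std::uint32_t i = e + 1u; // wraps to 0 only when e is UINT32_MAX
  return i == 0u;
}

// Exit test in the style of the second reproducer:
//   the i1 overflow flag of uadd.with.overflow(%e, 1)
constexpr bool exitsViaOverflowFlag(std::uint32_t e) {
  const std::uint32_t math = e + 1u;
  return math < e; // unsigned overflow: the wrapped sum is below the operand
}

// Both formulations should agree for every input value.
constexpr bool sameExitDecision(std::uint32_t e) {
  return exitsViaCompare(e) == exitsViaOverflowFlag(e);
}
```

Both tests return true only for `e == 0xFFFFFFFF`, so either form of the loop runs until the counter wraps.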

github-actions bot commented Sep 17, 2024

✅ With the latest revision this PR passed the C/C++ code formatter.

@VyacheslavLevytskyy VyacheslavLevytskyy force-pushed the intrinsics_with_aggregate_ret_1 branch from 616fece to ae5fe0e Compare September 18, 2024 09:25
@VyacheslavLevytskyy VyacheslavLevytskyy marked this pull request as ready for review September 18, 2024 09:38
@VyacheslavLevytskyy
Contributor Author

FYI @efriedma-quic

@VyacheslavLevytskyy VyacheslavLevytskyy requested review from s-perron and Keenuts and removed request for Keenuts September 18, 2024 09:45
Contributor

@s-perron s-perron left a comment

Looks okay with my limited knowledge, but I did not understand one part.


bool isInternalNonVoidIntrinsic(const Value *I) {
if (const auto *II = dyn_cast<IntrinsicInst>(I))
switch (II->getIntrinsicID()) {
Contributor

This is supposed to be the list of internal intrinsics not returning void?

Contributor Author

@VyacheslavLevytskyy VyacheslavLevytskyy Sep 19, 2024

Almost, but not exactly. This list doesn't include int_spv_gep, for example, because it returns a ptr type. The list is meant to include internal intrinsics that return a non-ptr value and so may potentially (but should not actually) participate in fake_use emission.

Contributor

In that case, isn't the return type fetchable from the CallBase parent class? Or am I missing something? (Seems like maintaining a list like that is bound to end up wrong)

Contributor Author

I think you are right. My motivation was to explore the topic further and widen the applicability of this PR's way of passing info through passes, but we may extend/address this later if needed. Another motivation was that internal intrinsics are intended for quite different things, and it's hard to separate them as a class to use in conditions, but again this doesn't matter for this PR specifically.

So for the goals of this PR it may indeed be better to be more general and not list intrinsics explicitly. I've changed the explicit list to F->getName().starts_with("llvm.spv.") to exclude internal SPIR-V backend intrinsics.
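
As a rough illustration of the name-based check described above, the following standalone C++ sketch mirrors the `starts_with("llvm.spv.")` test on plain strings. It is only a model under stated assumptions: the helper name `isInternalSpvIntrinsicName` is hypothetical, and it uses `std::string_view` in place of LLVM's `StringRef` so that it compiles without LLVM headers.

```cpp
#include <string_view>

// Hypothetical helper: true when a callee name carries the "llvm.spv."
// prefix used by internal SPIR-V backend intrinsics.
// std::string_view stands in for llvm::StringRef in this sketch.
constexpr bool isInternalSpvIntrinsicName(std::string_view name) {
  constexpr std::string_view prefix = "llvm.spv.";
  // substr clamps the length, so names shorter than the prefix are safe.
  return name.substr(0, prefix.size()) == prefix;
}
```

With a prefix test like this, a name such as `llvm.spv.gep` is excluded without being listed explicitly, while `llvm.uadd.with.overflow.i32` is not treated as internal.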

@VyacheslavLevytskyy VyacheslavLevytskyy force-pushed the intrinsics_with_aggregate_ret_1 branch from 97e333c to 54cc904 Compare September 20, 2024 09:44
@VyacheslavLevytskyy VyacheslavLevytskyy force-pushed the intrinsics_with_aggregate_ret_1 branch from d49e9a3 to c7a8975 Compare September 23, 2024 13:27
@VyacheslavLevytskyy VyacheslavLevytskyy merged commit a059b29 into llvm:main Sep 26, 2024
10 checks passed
Sterling-Augustine pushed a commit to Sterling-Augustine/llvm-project that referenced this pull request Sep 27, 2024