Commit f33fc10

[SYCL] Add AMDGPU_kernel calling convention to detected kernels (#14581)
The `free_function_kernels.cpp` test was hitting a bug in which kernels with demangled names for free functions were deleted in the `sycl-post-link` step of compilation. This happened because AMD kernels were not detected, due to a missing condition. The problem went unnoticed because the HIP device on CI does not have the `usm_shared_allocations` aspect, so the test was reported as unsupported there. When I ran it locally on a device that does have the `usm_shared_allocations` aspect, the test failed.
1 parent 44d3286 commit f33fc10
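
The effect of the missing condition can be illustrated with a small standalone model. This is only a sketch: `CallingConv` and `Function` below are simplified, hypothetical stand-ins for the LLVM types, and `isKernelOld`, `isKernelNew`, and `collectKernels` are names invented for illustration, not functions from the tree.

```cpp
#include <cassert>
#include <string>
#include <vector>

// Hypothetical stand-ins for llvm::CallingConv and llvm::Function.
enum class CallingConv { C, SPIR_FUNC, SPIR_KERNEL, AMDGPU_KERNEL };

struct Function {
  std::string Name;
  CallingConv CC;
};

// Shape of the check before this commit: only SPIR kernels are recognized.
bool isKernelOld(const Function &F) {
  return F.CC == CallingConv::SPIR_KERNEL;
}

// Shape of the check after this commit: AMDGPU kernels are recognized too.
bool isKernelNew(const Function &F) {
  return F.CC == CallingConv::SPIR_KERNEL ||
         F.CC == CallingConv::AMDGPU_KERNEL;
}

// Collect the names of all functions that the given predicate treats as
// kernels. Anything not collected here would not be kept as a kernel
// entry point in this model.
template <typename Pred>
std::vector<std::string> collectKernels(const std::vector<Function> &Module,
                                        Pred IsKernel) {
  std::vector<std::string> Names;
  for (const Function &F : Module)
    if (IsKernel(F))
      Names.push_back(F.Name);
  return Names;
}
```

For a module containing only `amdgpu_kernel` functions, `collectKernels(Module, isKernelOld)` returns an empty list while `collectKernels(Module, isKernelNew)` returns every kernel, which mirrors why the kernels were dropped on AMD targets before the fix.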

4 files changed: +22 -4 lines changed

llvm/lib/SYCLLowerIR/ModuleSplitter.cpp

Lines changed: 2 additions & 1 deletion
@@ -113,7 +113,8 @@ bool isGenericBuiltin(StringRef FName) {
 }
 
 bool isKernel(const Function &F) {
-  return F.getCallingConv() == CallingConv::SPIR_KERNEL;
+  return F.getCallingConv() == CallingConv::SPIR_KERNEL ||
+         F.getCallingConv() == CallingConv::AMDGPU_KERNEL;
 }
 
 bool isEntryPoint(const Function &F, bool EmitOnlyKernelsAsEntryPoints) {
Lines changed: 17 additions & 0 deletions
@@ -0,0 +1,17 @@
+; -- Per-kernel split
+; RUN: sycl-post-link -split=kernel -emit-only-kernels-as-entry-points -S < %s -o %tC.table
+; RUN: FileCheck %s -input-file=%tC_0.ll --check-prefixes CHECK-A0
+; RUN: FileCheck %s -input-file=%tC_1.ll --check-prefixes CHECK-A1
+
+define dso_local amdgpu_kernel void @Kernel1() {
+  ret void
+}
+
+define dso_local amdgpu_kernel void @Kernel2() {
+  ret void
+}
+
+; CHECK-A0: define dso_local amdgpu_kernel void @Kernel2()
+; CHECK-A0-NOT: define dso_local amdgpu_kernel void @Kernel1()
+; CHECK-A1-NOT: define dso_local amdgpu_kernel void @Kernel2()
+; CHECK-A1: define dso_local amdgpu_kernel void @Kernel1()
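
The CHECK lines in the new test expect each output module to define exactly one of the two kernels. A rough model of that expectation is sketched below, assuming that a per-kernel split emits one group per detected kernel. `Fn` and `splitPerKernel` are hypothetical names used only here, and the order in which the real tool numbers its output files is not modeled.

```cpp
#include <cassert>
#include <string>
#include <vector>

// Hypothetical, simplified view of a function in the input module.
struct Fn {
  std::string Name;
  bool IsKernel; // result of the kernel-detection check
};

// Model of a per-kernel split: every detected kernel becomes its own
// single-entry group. Functions that are not detected as kernels never
// start a group in this model.
std::vector<std::vector<std::string>>
splitPerKernel(const std::vector<Fn> &Module) {
  std::vector<std::vector<std::string>> Groups;
  for (const Fn &F : Module)
    if (F.IsKernel)
      Groups.push_back(std::vector<std::string>{F.Name});
  return Groups;
}
```

With both `Kernel1` and `Kernel2` detected, the model yields two groups of one kernel each. If detection fails for both, as it did for `amdgpu_kernel` functions before this change, no group is produced at all.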

sycl/test-e2e/KernelAndProgram/free_function_apis.cpp

Lines changed: 1 addition & 1 deletion
@@ -3,7 +3,7 @@
 // RUN: %{run} %t.out
 
 // The name mangling for free function kernels currently does not work with PTX.
-// UNSUPPORTED: cuda, hip
+// UNSUPPORTED: cuda
 
 #include <iostream>
 #include <sycl/detail/core.hpp>

sycl/test-e2e/KernelAndProgram/free_function_kernels.cpp

Lines changed: 2 additions & 2 deletions
@@ -3,7 +3,7 @@
 // RUN: %{run} %t.out
 
 // The name mangling for free function kernels currently does not work with PTX.
-// UNSUPPORTED: cuda, hip
+// UNSUPPORTED: cuda
 
 // This test tests free function kernel code generation and execution.
 
@@ -212,7 +212,7 @@ SYCL_EXTERNAL SYCL_EXT_ONEAPI_FUNCTION_PROPERTY((
   ptr2D[GId.get(0)][GId.get(1)] = LId.get(0) + LId.get(1) + start;
 }
 
-// Explicit instantiation with int*.
+// Explicit instantiation with "int*".
 template void ff_3(int *ptr, int start);
 
 bool test_3(queue Queue) {
