Skip to content

[sycl-post-link] Don't remove llvm.compiler.used for NVPTX #15224

New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

Merged
merged 7 commits into from
Sep 6, 2024
Merged
Show file tree
Hide file tree
Changes from all commits
Commits
File filter

Filter by extension

Filter by extension

Conversations
Failed to load comments.
Loading
Jump to
Jump to file
Failed to load files.
Loading
Diff view
Diff view
19 changes: 19 additions & 0 deletions llvm/test/CodeGen/AMDGPU/remove-compiler-used.ll
Original file line number Diff line number Diff line change
@@ -0,0 +1,19 @@
;; This test verifies llc on AMDGCN will delete the llvm.compiler.used symbol
;; while keeping the symbol in the outputted ASM.

; RUN: llc < %s -mtriple=amdgcn-amd-amdhsa -mcpu=gfx900 | FileCheck %s
; RUN: llc < %s -mtriple=amdgcn-amd-amdhsa -mcpu=gfx906 | FileCheck %s
; RUN: llc < %s -mtriple=amdgcn-amd-amdhsa -mcpu=gfx90a | FileCheck %s

@keep_this = internal global i32 2, align 4
@llvm.compiler.used = appending global [1 x ptr] [ptr @keep_this], section "llvm.metadata"

; CHECK-NOT: llvm.metadata
; CHECK-NOT: llvm{{.*}}used
; CHECK-NOT: llvm{{.*}}compiler{{.*}}used

; CHECK: .type keep_this,@object ;

; CHECK-NOT: llvm.metadata
; CHECK-NOT: llvm{{.*}}used
; CHECK-NOT: llvm{{.*}}compiler{{.*}}used
17 changes: 17 additions & 0 deletions llvm/test/CodeGen/NVPTX/remove-compiler-used.ll
Original file line number Diff line number Diff line change
@@ -0,0 +1,17 @@
;; This test verifies llc on NVPTX will delete the llvm.compiler.used symbol
;; while keeping the symbol in the outputted ASM.

; RUN: llc < %s -march=nvptx64 | FileCheck %s

@keep_this = internal global i32 2, align 4
@llvm.compiler.used = appending global [1 x ptr] [ptr @keep_this], section "llvm.metadata"

; CHECK-NOT: llvm.metadata
; CHECK-NOT: llvm{{.*}}used
; CHECK-NOT: llvm{{.*}}compiler{{.*}}used

; CHECK: .global .align 4 .u32 keep_this

; CHECK-NOT: llvm.metadata
; CHECK-NOT: llvm{{.*}}used
; CHECK-NOT: llvm{{.*}}compiler{{.*}}used
10 changes: 5 additions & 5 deletions llvm/tools/sycl-post-link/sycl-post-link.cpp
Original file line number Diff line number Diff line change
Expand Up @@ -784,11 +784,11 @@ processInputModule(std::unique_ptr<Module> M) {
Modified |= removeSYCLKernelsConstRefArray(*M.get());

// There may be device_global variables kept alive in "llvm.compiler.used"
// to keep the optimizer from wrongfully removing them. Since it has served
// its purpose, these device_global variables can be removed. If they are not
// used inside the device code after they have been removed from
// "llvm.compiler.used" they can be erased safely.
Modified |= removeDeviceGlobalFromCompilerUsed(*M.get());
// to keep the optimizer from wrongfully removing them. llvm.compiler.used
Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

We intend to replace the translator with SPIR-V backend eventually. Can we remove the check at that point of time?

Thanks

Copy link
Contributor Author

@hdelan hdelan Sep 4, 2024

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Once the SPIRV backend is being used instead of llvm-spirv I think it'd be better to remove the llvm.compiler.used symbol in the backend, instead of in sycl-post-link, although some might disagree. I think it's OK to keep the current behaviour for the moment unless it becomes a problem later.

// symbols are usually removed at backend lowering, but this is handled here
// for SPIR-V since SYCL compilation uses llvm-spirv, not the SPIR-V backend.
if (auto Triple = M->getTargetTriple().find("spir") != std::string::npos)
Modified |= removeDeviceGlobalFromCompilerUsed(*M.get());

// Instrument each image scope device globals if the module has been
// instrumented by sanitizer pass.
Expand Down
31 changes: 31 additions & 0 deletions sycl/test-e2e/DeviceGlobal/device_global_static.cpp
Original file line number Diff line number Diff line change
@@ -0,0 +1,31 @@
// RUN: %{build} -o %t.out
// RUN: %{run} %t.out
//
// The OpenCL GPU backends do not currently support device_global backend
// calls.
// UNSUPPORTED: opencl && gpu
//
// Tests static device_global access through device kernels.

#include "common.hpp"

static device_global<int[4], TestProperties> DeviceGlobalVar;

int main() {
queue Q;

Q.single_task([=]() { DeviceGlobalVar.get()[0] = 42; });
// Make sure that the write happens before subsequent read
Q.wait();

int OutVal = 0;
{
buffer<int, 1> OutBuf(&OutVal, 1);
Q.submit([&](handler &CGH) {
auto OutAcc = OutBuf.get_access<access::mode::write>(CGH);
CGH.single_task([=]() { OutAcc[0] = DeviceGlobalVar.get()[0]; });
});
}
assert(OutVal == 42 && "Read value does not match.");
return 0;
}
36 changes: 36 additions & 0 deletions sycl/test/check_device_code/device_global_static.cpp
Original file line number Diff line number Diff line change
@@ -0,0 +1,36 @@
// Tests that the llvm.compiler.used symbol, which is used to implement static
Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Why sycl/test instead of sycl-post-link test?

Copy link
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

It's a bit of a mess but the llvm.used symbol is handled in different parts of the compilation depending on backend. So it's important to make sure that the tests are working for an entire compilation pipeline, not just for a particular stage of the action.

Copy link
Contributor

@aelovikov-intel aelovikov-intel Aug 29, 2024

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

What is the latest stage it is being removed? Why can't we have test for that component? Is the "mess" final, or do we expect pipelines to be unified?

Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

llvm.used should survive to final linking, llvm.compiler.used is okay for intermediary tools to drop: https://llvm.org/docs/LangRef.html#the-llvm-used-global-variable

Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Do we need a test when programmer uses __attribute__((used)) on something to ensure that something isn't being dropped?

Copy link
Contributor Author

@hdelan hdelan Sep 5, 2024

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Device image compression will be optional (#15124) - unless you specify --offload-compress in CLI, by default, we won't compress images. So, image compression might not be an issue for this particular test.

If image compression is "opt-in" and we will not opt in in this test, then why is this an issue @aelovikov-intel ? It seems unlikely that binary formatting is going to change drastically in the next while, and if it does then I think this test will be the least of our problems.

I am using strings as it's a simpler, more portable way to find a global symbol, using the current device code formats, than using proprietary binutils that are likely to belong to proprietary toolkits. For instance cuobjdump is not necessarily distributed in every CUDA version. Relying on a tool like this may break CI with a CUDA toolkit version change. Likewise I don't want to rely on a tool like spirv-dis which often breaks and doesn't keep up to date with the latest spir-v features.

Copy link
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Thanks for response @uditagarwal97 :)

Copy link
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

I've updated the test to use llvm-strings instead of strings. Is this sufficient @aelovikov-intel ?

Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

If I just delete the test will you approve this PR? I don't think this discussion is going anywhere

Yes.

Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

If image compression is "opt-in" and we will not opt in in this test, then why is this an issue @aelovikov-intel ?

Because that's an implementation detail of something unrelated to your test and it must not make any assumptions of it.

// device globals, is removed at some point in compilation. For SPIR-V this
// symbol is removed at sycl-post-link and for NVPTX/AMDGCN it is removed at
// lowering.
//
// It also checks that the symbol can be found in an object file for a given
// triple, thus validating that `llvm-strings` can successfully be used to
// check for the presence of the symbol.

// UNSUPPORTED: windows

// RUN: %clangxx -fsycl -fsycl-device-only %s -o %t
// RUN: llvm-strings %t | grep "llvm.compiler.used"
// RUN: %clangxx -fsycl %s -o %t
// RUN: llvm-strings %t | not grep "llvm.compiler.used"

// RUN: %if cuda %{ %clangxx -fsycl -fsycl-device-only -fsycl-targets=nvptx64-nvidia-cuda %s -o %t %}
// RUN: %if cuda %{ llvm-strings %t | grep "llvm.compiler.used" %}
// RUN: %if cuda %{ %clangxx -fsycl -fsycl-targets=nvptx64-nvidia-cuda %s -o %t %}
// RUN: %if cuda %{ llvm-strings %t | not grep "llvm.compiler.used" %}

// RUN: %if hip_amd %{ %clangxx -fsycl -fsycl-device-only -fsycl-targets=amd_gpu_gfx906 %s -o %t %}
// RUN: %if hip_amd %{ llvm-strings %t | grep "llvm.compiler.used" %}
// RUN: %if hip_amd %{ %clangxx -fsycl -fsycl-targets=amd_gpu_gfx906 %s -o %t %}
// RUN: %if hip_amd %{ llvm-strings %t | not grep "llvm.compiler.used" %}

#include <sycl/sycl.hpp>

using namespace sycl;
using namespace sycl::ext::oneapi::experimental;

static device_global<int> DeviceGlobalVar;

int main() {
sycl::queue{}.single_task([=] { volatile int ReadVal = DeviceGlobalVar; });
}
Loading