[lld] Change `--lto-emit-llvm` to use the pre-codegen module #97480

jhuber6 · 2024-07-02T21:07:41Z

Summary:
Currently, the `--lto-emit-llvm` option writes out the
post-internalization bitcode, that is, the bitcode before any
optimizations or other pipelines have been run on it. This patch changes
the option to use the pre-codegen module, which is the state of the
LLVM IR after the optimizations have been run.

I believe this makes sense because `--lto-emit-llvm` seems to imply
that we should emit the final output of the LLVM pipeline as if it
were the desired output, which should include optimizations at the
requested optimization level. My main motivation for this change is to
be able to link several LLVM IR files into a single one that
I can then pass back to `ld.lld` later (for JIT purposes).
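
To illustrate the ordering described above, here is a minimal toy sketch in C++. It does not use the real LLVM API: `ToyModule`, `ToyConfig`, `runToyPipeline`, `Observed`, and `observeHooks` are hypothetical names made up for this example, and only the two hook names are taken from the patch. The sketch assumes the simplified order internalize, optimize, codegen.

```cpp
#include <cassert>
#include <functional>
#include <string>
#include <vector>

// Toy stand-in for llvm::Module (hypothetical): function names plus a flag
// that records whether the optimization pipeline has run.
struct ToyModule {
  std::vector<std::string> functions;
  bool optimized;
};

// Toy stand-in for the hook members of lto::Config. Only the hook names
// mirror the patch; the signatures here are simplified assumptions.
struct ToyConfig {
  std::function<void(const ToyModule &)> PostInternalizeModuleHook;
  std::function<void(const ToyModule &)> PreCodeGenModuleHook;
};

// Hypothetical driver: internalize, fire hook, optimize, fire hook, codegen.
inline void runToyPipeline(ToyModule &m, const ToyConfig &c) {
  assert(!m.optimized && "the toy module should start unoptimized");
  // Step 1: internalization would happen here (a no-op in this toy).
  if (c.PostInternalizeModuleHook)
    c.PostInternalizeModuleHook(m);
  // Step 2: the optimization pipeline runs.
  m.optimized = true;
  if (c.PreCodeGenModuleHook)
    c.PreCodeGenModuleHook(m);
  // Step 3: code generation would follow here.
}

// What each hook observed about the module's optimization state.
struct Observed {
  bool postInternalizeSawOptimized;
  bool preCodeGenSawOptimized;
};

// Runs the toy pipeline once and records what each hook saw.
inline Observed observeHooks() {
  Observed seen{true, false};
  ToyConfig c;
  c.PostInternalizeModuleHook = [&seen](const ToyModule &m) {
    seen.postInternalizeSawOptimized = m.optimized;
  };
  c.PreCodeGenModuleHook = [&seen](const ToyModule &m) {
    seen.preCodeGenSawOptimized = m.optimized;
  };
  ToyModule m;
  m.functions.push_back("main");
  m.optimized = false;
  runToyPipeline(m, c);
  return seen;
}
```

In this toy model, the post-internalization hook sees the module before optimization and the pre-codegen hook sees it afterwards, which is why writing the bitcode from the later hook captures the optimized IR.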

llvmbot · 2024-07-02T21:08:12Z

@llvm/pr-subscribers-lld

@llvm/pr-subscribers-lld-elf

Author: Joseph Huber (jhuber6)

Changes

Full diff: https://github.com/llvm/llvm-project/pull/97480.diff

2 Files Affected:

(modified) lld/ELF/LTO.cpp (+1-1)
(modified) lld/test/ELF/lto/emit-llvm.ll (+4-2)

diff --git a/lld/ELF/LTO.cpp b/lld/ELF/LTO.cpp
index 3d92007469263..935d0a9eab9ee 100644
--- a/lld/ELF/LTO.cpp
+++ b/lld/ELF/LTO.cpp
@@ -147,7 +147,7 @@ static lto::Config createConfig() {
   c.PGOWarnMismatch = config->ltoPGOWarnMismatch;
 
   if (config->emitLLVM) {
-    c.PostInternalizeModuleHook = [](size_t task, const Module &m) {
+    c.PreCodeGenModuleHook = [](size_t task, const Module &m) {
       if (std::unique_ptr<raw_fd_ostream> os =
               openLTOOutputFile(config->outputFile))
         WriteBitcodeToFile(m, *os, false);
diff --git a/lld/test/ELF/lto/emit-llvm.ll b/lld/test/ELF/lto/emit-llvm.ll
index 01f5a056e0c0d..37488016a4bc2 100644
--- a/lld/test/ELF/lto/emit-llvm.ll
+++ b/lld/test/ELF/lto/emit-llvm.ll
@@ -9,11 +9,13 @@
 ; RUN: ld.lld --plugin-opt=emit-llvm -mllvm -bitcode-flush-threshold=0 -o /dev/null %t.o
 ; RUN: ld.lld --lto-emit-llvm -mllvm -bitcode-flush-threshold=0 -o /dev/null %t.o
 
-; CHECK: define internal void @main()
+; CHECK: define hidden void @main()
 
 target datalayout = "e-m:e-p270:32:32-p271:32:32-p272:64:64-i64:64-f80:128-n8:16:32:64-S128"
 target triple = "x86_64-unknown-linux-gnu"
 
-define void @main() {
+@llvm.compiler.used = appending global [1 x ptr] [ptr @main], section "llvm.metadata"
+
+define hidden void @main() {
   ret void
 }

MaskRay · 2024-07-02T21:35:00Z

PreCodeGenModuleHook makes sense as that is similar to clang cc1 -emit-llvm and -emit-llvm-bc for ThinLTO backend compiles.

MaskRay · 2024-07-02T21:35:39Z

lld/test/ELF/lto/emit-llvm.ll


 target datalayout = "e-m:e-p270:32:32-p271:32:32-p272:64:64-i64:64-f80:128-n8:16:32:64-S128"
 target triple = "x86_64-unknown-linux-gnu"

-define void @main() {
+@llvm.compiler.used = appending global [1 x ptr] [ptr @main], section "llvm.metadata"


The test output doesn't differ between PostInternalize and PreCodeGen. Could you use a different one? Perhaps two modules are needed to show a difference.

jhuber6 · 2024-07-02

The optimizer removes `@main` completely, so I marked it as used so that it sticks around. I could make the test slightly more complicated if you want, but I think it is mostly there to show "Yes, it outputs something."
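
The reasoning in the reply above can be sketched with a rough toy model in C++. This is an assumption-laden illustration, not LLVM's actual logic: `ToyFunction` and `survivingFunctions` are hypothetical names, and the removal rule (drop a function that is internal, unreferenced, and not listed as used) is a deliberate simplification.

```cpp
#include <cassert>
#include <set>
#include <string>
#include <vector>

// Toy description of a function in a module (hypothetical).
struct ToyFunction {
  std::string name;
  bool internal;   // not visible outside the module
  bool referenced; // called or otherwise referenced inside the module
};

// Simplified dead-function elimination: a function is dropped only if it is
// internal, unreferenced, and not named in the "used" set. The "used" set
// plays the role that @llvm.compiler.used plays in the updated test.
inline std::vector<std::string>
survivingFunctions(const std::vector<ToyFunction> &fns,
                   const std::set<std::string> &used) {
  std::vector<std::string> kept;
  for (const ToyFunction &f : fns) {
    assert(!f.name.empty() && "toy functions must be named");
    bool removable = f.internal && !f.referenced && used.count(f.name) == 0;
    if (!removable)
      kept.push_back(f.name);
  }
  return kept;
}
```

With a single internal, unreferenced `main`, this toy rule removes the function when the used set is empty and keeps it when the set contains `main`, which mirrors why the test now needs the extra `@llvm.compiler.used` entry to have anything left to check.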

This matches ELF (#97480). clang cc1 -emit-llvm and -emit-llvm-bc for ThinLTO backend compilation also use `PreCodeGenModuleHook`. While here, replace deprecated %T with %t. Pull Request: #98589

jhuber6 requested review from jdoerfert and MaskRay July 2, 2024 21:07

llvmbot added lld lld:ELF labels Jul 2, 2024

MaskRay reviewed Jul 2, 2024

MaskRay approved these changes Jul 2, 2024

jhuber6 merged commit 594bc52 into llvm:main Jul 2, 2024
8 of 9 checks passed

MaskRay mentioned this pull request Jul 12, 2024

[lld-link] Change /lldemit:llvm to use the pre-codegen module #98589

Merged
