[fatlto] Add coroutine passes when using FatLTO with ThinLTO #134434
Conversation
@llvm/pr-subscribers-clang @llvm/pr-subscribers-coroutines

Author: Paul Kirth (ilovepi)

Changes

When coroutines are used w/ both -ffat-lto-objects and -flto=thin, the coroutine passes are not added to the optimization pipelines. Ensure they are added before ModuleOptimization to generate a working ELF object.

Fixes #134409.

Full diff: https://github.com/llvm/llvm-project/pull/134434.diff

2 Files Affected:
diff --git a/clang/test/CodeGenCoroutines/pr134409.cpp b/clang/test/CodeGenCoroutines/pr134409.cpp
new file mode 100644
index 0000000000000..3f3d95e191594
--- /dev/null
+++ b/clang/test/CodeGenCoroutines/pr134409.cpp
@@ -0,0 +1,42 @@
+// An end-to-end test to make sure coroutine passes are added for thinlto.
+
+// RUN: %clang_cc1 -std=c++23 -ffat-lto-objects -flto=thin -emit-llvm %s -O3 -o - \
+// RUN: | FileCheck %s
+
+#include "Inputs/coroutine.h"
+
+class BasicCoroutine {
+public:
+ struct Promise {
+ BasicCoroutine get_return_object() { return BasicCoroutine {}; }
+
+ void unhandled_exception() noexcept { }
+
+ void return_void() noexcept { }
+
+ std::suspend_never initial_suspend() noexcept { return {}; }
+ std::suspend_never final_suspend() noexcept { return {}; }
+ };
+ using promise_type = Promise;
+};
+
+// COM: match the embedded module, so we don't match something in it by accident.
+// CHECK: @llvm.embedded.object = {{.*}}
+// CHECK: @llvm.compiler.used = {{.*}}
+
+BasicCoroutine coro() {
+// CHECK: define {{.*}} void @_Z4corov() {{.*}} {
+// CHECK-NEXT: entry:
+// CHECK-NEXT: ret void
+// CHECK-NEXT: }
+ co_return;
+}
+
+int main() {
+// CHECK: define {{.*}} i32 @main() {{.*}} {
+// CHECK-NEXT: entry:
+// CHECK-NEXT: ret i32 0
+// CHECK-NEXT: }
+ coro();
+}
+
diff --git a/llvm/lib/Passes/PassBuilderPipelines.cpp b/llvm/lib/Passes/PassBuilderPipelines.cpp
index a18b36ba40754..4b15e0fb5c2a7 100644
--- a/llvm/lib/Passes/PassBuilderPipelines.cpp
+++ b/llvm/lib/Passes/PassBuilderPipelines.cpp
@@ -1688,10 +1688,15 @@ PassBuilder::buildFatLTODefaultPipeline(OptimizationLevel Level, bool ThinLTO,
MPM.addPass(
LowerTypeTestsPass(nullptr, nullptr, lowertypetests::DropTestKind::All));
- // Use the ThinLTO post-link pipeline with sample profiling
- if (ThinLTO && PGOOpt && PGOOpt->Action == PGOOptions::SampleUse)
+ // ModuleSimplification does not run the coroutine passes for ThinLTOPreLink,
+ // so we need the coroutine passes to run for ThinLTO builds, otherwise they
+ // will miscompile.
+ if (ThinLTO) {
+ // TODO: determine how to only run the ThinLTODefaultPipeline when using
+ // sample profiling. Ideally, we'd be able to still use the module
+ // optimization pipeline, with additional cleanups for coroutines.
MPM.addPass(buildThinLTODefaultPipeline(Level, /*ImportSummary=*/nullptr));
- else {
+ } else {
// otherwise, just use module optimization
MPM.addPass(
buildModuleOptimizationPipeline(Level, ThinOrFullLTOPhase::None));
cc: @mcatanzaro
Switching to the ThinLTO post-link pipeline will have a big impact on optimization behavior and compile-time. I think it would be safer to make a change along the lines of #126168, i.e. to schedule the necessary passes in the FatLTO pipeline. Without having looked into it closely, it would probably be okay to just schedule them before the buildModuleOptimizationPipeline call.
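As a rough sketch of that suggestion (not the committed patch; it assumes the MPM and Level variables already in scope in buildFatLTODefaultPipeline, and borrows the pass sequence that ends up quoted in the review below), the coroutine lowering passes could be scheduled just ahead of the existing module optimization call:

// Lower coroutines before the FatLTO module optimization pipeline runs.
CGSCCPassManager CGPM;
// Split coroutines into the ramp function plus resume/destroy functions.
CGPM.addPass(CoroSplitPass(Level != OptimizationLevel::O0));
// Elide coroutine frame heap allocations for calls annotated as safe to elide.
CGPM.addPass(CoroAnnotationElidePass());
MPM.addPass(createModuleToPostOrderCGSCCPassAdaptor(std::move(CGPM)));
// Lower any remaining coroutine intrinsics.
MPM.addPass(CoroCleanupPass());
// Then continue with the existing module optimization pipeline.
MPM.addPass(buildModuleOptimizationPipeline(Level, ThinOrFullLTOPhase::None));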
hmm, I tried adding some of the individual passes, and ran into some other errors. Let me give that another try w/ that PR as an example.
Force-pushed from 6c3ea0b to a491924
// TODO: replace w/ buildCoroWrapper() when it takes phase and level into
// consideration.
CGSCCPassManager CGPM;
CGPM.addPass(CoroSplitPass(Level != OptimizationLevel::O0));
CGPM.addPass(CoroAnnotationElidePass());
MPM.addPass(createModuleToPostOrderCGSCCPassAdaptor(std::move(CGPM)));
MPM.addPass(CoroCleanupPass());
Also possible to do MPM.addPass(buildModuleSimplificationPipeline(Level, ThinOrFullLTOPhase::ThinLTOPostLink));
That seems to generate better code here, since it allows the tail call in the test to get removed. But that seems a bit expensive, just to give the inliner a second chance.
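For comparison, a hypothetical sketch of that alternative at the same point in buildFatLTODefaultPipeline (again assuming the surrounding MPM and Level), rerunning module simplification as if this were the ThinLTO post-link phase and then running module optimization as before:

// Rerun the simplification pipeline post-link; this includes the inliner and
// the coroutine CGSCC passes, so coroutines get lowered as a side effect.
MPM.addPass(buildModuleSimplificationPipeline(
    Level, ThinOrFullLTOPhase::ThinLTOPostLink));
MPM.addPass(
    buildModuleOptimizationPipeline(Level, ThinOrFullLTOPhase::None));

As noted in the next comment, simplification plus optimization is effectively the ThinLTO post-link pipeline, so this trades extra compile time for better cleanup (e.g. removing the tail call in the test).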
I think the current PR is a good way to start, as it avoids the miscompilation in a minimal way that is suitable for backporting.
Using the module simplification pipeline here (+ the module optimization pipeline) would effectively be the same as using the ThinLTO post-link pipeline.
I ran some tests, and it looks like adding -ffat-lto-objects adds about 10% overhead to the clang build (https://llvm-compile-time-tracker.com/compare.php?from=44923d8631fb28b4de54d4210762f256c3894cef&to=2bdf721c2d37af6fcbd931d963f586478cb55f17&stat=instructions:u). On top of that, switching this from using the module optimization pipeline to the ThinLTO post-link pipeline (your original PR) adds an extra 4%: https://llvm-compile-time-tracker.com/compare.php?from=2bdf721c2d37af6fcbd931d963f586478cb55f17&to=63666418fbe19f30bf971796747a751b4e1c57f3&stat=instructions:u
Adding 4% overhead is not great, but also not terrible, so maybe it's worth it to avoid pipeline compatibility issues in a principled way. It does make the codegen for FatLTO ELF diverge more from a normal compilation though.
Ideally we'd be able to rerun the simplification pipeline, but skip the inliner pipeline for already optimized functions. (I think @aeubanks had a prototype that did that for the actual ThinLTO scenario, by looking at available_externally functions. The situation here is somewhat different.)
Hmm, 10% seems a bit high for overhead on build times, though we haven't used it too much w/ ThinLTO in our toolchain, so maybe that's it?
Looking at our build times when we enabled it in our toolchain, we saw about a 2.5% slowdown in total build time, but a 22% improvement in test time (ninja check-*). Overall that ended up being about 4.4% speedup in total time.
So, I'm not surprised it slowed down for just the build, but I am surprised it added a full 10%. Well, I guess I/O can have a lot of variance between machines, so maybe that's enough to explain it, since for ThinLTO it probably more than doubles the size of the .o.
When coroutines are used w/ both -ffat-lto-objects and -flto=thin, the coroutine passes are not added to the optimization pipelines. Ensure they are added before ModuleOptimization to generate a working ELF object. Fixes #134409.
Force-pushed from a491924 to 80c3615
LGTM, but PR title needs adjustment.
/cherry-pick 268c065
/pull-request #134711
[fatlto] Add coroutine passes when using FatLTO with ThinLTO (#134434)

When coroutines are used w/ both -ffat-lto-objects and -flto=thin, the coroutine passes are not added to the optimization pipelines. Ensure they are added before ModuleOptimization to generate a working ELF object.

Fixes llvm#134409.

(cherry picked from commit 268c065)