Skip to content

[lld][WebAssembly] Fix spurious signature mismatch under LTO #136197

New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

Merged
merged 1 commit into from
Apr 18, 2025

Conversation

BertalanD
Copy link
Member

@BertalanD BertalanD commented Apr 17, 2025

When generating C++ vtables, Clang declares virtual functions as void(void) when their signature is not known (e.g.parameter types are forward-declared). As WASM type checks imports, this would conflict with the real definition during linking. Commit 59f959f introduced a workaround for this by deferring signature assignment until a definition or direct call is seen.

When performing LTO, LLD first scans the bitcode files and creates DefinedFunction symbol table entries for their contents. After LTO codegen, they are replaced with UndefinedFunctions (so that the definitions will be pulled in from the native LTO-d files when they are added). At this point, if a function is only referenced in bitcode, its signature remains nullptr.

From here, it should have behaved like in the non-LTO case: the first direct call sets the signature. However, as the isCalledDirectly flag was set to true, the missing signature was filled in by the type of the first reference to the function, which could be a void(void) vtable entry, which would then conflict with the real definition.

This commit sets isCalledDirectly to false so that the signature will only be populated when a direct call is found.

Potentially fixes godotengine/godot#104497 and emscripten-core/emscripten#10831

When generating C++ vtables, Clang declares virtual functions as
`void(void)` when their signature is not known (e.g.parameter types are
forward-declared). As WASM type checks imports, this would conflict with
the real definition during linking. Commit 59f959f introduced a
workaround for this by deferring signature assignment until a definition
or direct call is seen.

When performing LTO, LLD first scans the bitcode files and creates
`DefinedFunction` symbol table entries for their contents. After LTO
codegen, they are replaced with `UndefinedFunction`s (so that the
definitions will be pulled in from the native LTO-d files when they are
added). At this point, if a function is only referenced in bitcode, its
signature remains `nullptr`.

From here, it should have behaved like in the non-LTO case: the first
direct call sets the signature. However, as the `isCalledDirectly` flag
was set to true, the missing signature was filled in by the type of the
first reference to the function, which could be a `void(void)` vtable
entry, which would then conflict with the real definition.

This commit sets `isCalledDirectly` to false so that the signature will
only be populated when a direct call is found.

Potentially fixes godotengine/godot#104497
@BertalanD BertalanD requested a review from sbc100 April 17, 2025 20:59
@llvmbot llvmbot added the lld label Apr 17, 2025
@llvmbot
Copy link
Member

llvmbot commented Apr 17, 2025

@llvm/pr-subscribers-lld

@llvm/pr-subscribers-lld-wasm

Author: Daniel Bertalan (BertalanD)

Changes

When generating C++ vtables, Clang declares virtual functions as void(void) when their signature is not known (e.g.parameter types are forward-declared). As WASM type checks imports, this would conflict with the real definition during linking. Commit 59f959f introduced a workaround for this by deferring signature assignment until a definition or direct call is seen.

When performing LTO, LLD first scans the bitcode files and creates DefinedFunction symbol table entries for their contents. After LTO codegen, they are replaced with UndefinedFunctions (so that the definitions will be pulled in from the native LTO-d files when they are added). At this point, if a function is only referenced in bitcode, its signature remains nullptr.

From here, it should have behaved like in the non-LTO case: the first direct call sets the signature. However, as the isCalledDirectly flag was set to true, the missing signature was filled in by the type of the first reference to the function, which could be a void(void) vtable entry, which would then conflict with the real definition.

This commit sets isCalledDirectly to false so that the signature will only be populated when a direct call is found.

Potentially fixes godotengine/godot#104497


Full diff: https://github.com/llvm/llvm-project/pull/136197.diff

2 Files Affected:

  • (added) lld/test/wasm/lto/thinlto-signature-mismatch-unknown.ll (+45)
  • (modified) lld/wasm/LTO.cpp (+2-1)
diff --git a/lld/test/wasm/lto/thinlto-signature-mismatch-unknown.ll b/lld/test/wasm/lto/thinlto-signature-mismatch-unknown.ll
new file mode 100644
index 0000000000000..2f216e5853ff8
--- /dev/null
+++ b/lld/test/wasm/lto/thinlto-signature-mismatch-unknown.ll
@@ -0,0 +1,45 @@
+; RUN: rm -rf %t && split-file %s %t && cd %t
+; RUN: opt -thinlto-bc a.ll -o a.o
+; RUN: opt -thinlto-bc b.ll -o b.o
+; RUN: llvm-ar rcs b.a b.o
+; RUN: opt -thinlto-bc c.ll -o c.o
+
+;; Taking the address of the incorrectly declared @foo should not generate a warning.
+; RUN: wasm-ld --fatal-warnings --no-entry --export-all a.o b.a -o a.out \
+; RUN:         | FileCheck %s --implicit-check-not 'warning' --allow-empty
+
+;; But we should still warn if we call the function with the wrong signature.
+; RUN: not wasm-ld --fatal-warnings --no-entry --export-all a.o b.a c.o -o b.out 2>&1 \
+; RUN:         | FileCheck %s --check-prefix=INVALID
+
+; INVALID: error: function signature mismatch: foo
+; INVALID: >>> defined as () -> void
+; INVALID: >>> defined as () -> i32
+
+;--- a.ll
+target datalayout = "e-m:e-p:32:32-p10:8:8-p20:8:8-i64:64-n32:64-S128-ni:1:10:20"
+target triple = "wasm32-unknown-unknown"
+
+@ptr = constant ptr @foo
+declare void @foo()
+
+;--- b.ll
+target datalayout = "e-m:e-p:32:32-p10:8:8-p20:8:8-i64:64-n32:64-S128-ni:1:10:20"
+target triple = "wasm32-unknown-unknown"
+
+define i32 @foo() noinline {
+entry:
+  ret i32 42
+}
+
+;--- c.ll
+target datalayout = "e-m:e-p:32:32-p10:8:8-p20:8:8-i64:64-n32:64-S128-ni:1:10:20"
+target triple = "wasm32-unknown-unknown"
+
+declare void @foo()
+
+define void @invalid() {
+entry:
+    call void @foo()
+    ret void
+}
diff --git a/lld/wasm/LTO.cpp b/lld/wasm/LTO.cpp
index b9bd48acd6dc1..ab63281012eae 100644
--- a/lld/wasm/LTO.cpp
+++ b/lld/wasm/LTO.cpp
@@ -108,9 +108,10 @@ BitcodeCompiler::~BitcodeCompiler() = default;
 
 static void undefine(Symbol *s) {
   if (auto f = dyn_cast<DefinedFunction>(s))
+    // If the signature is null, there were no calls from non-bitcode objects.
     replaceSymbol<UndefinedFunction>(f, f->getName(), std::nullopt,
                                      std::nullopt, 0, f->getFile(),
-                                     f->signature);
+                                     f->signature, f->signature != nullptr);
   else if (isa<DefinedData>(s))
     replaceSymbol<UndefinedData>(s, s->getName(), 0, s->getFile());
   else

Copy link
Collaborator

@sbc100 sbc100 left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Nice! Thanks for working on this.

replaceSymbol<UndefinedFunction>(f, f->getName(), std::nullopt,
std::nullopt, 0, f->getFile(),
f->signature);
f->signature, f->signature != nullptr);
Copy link
Collaborator

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

I wonder if we should instead make SymbolTable.cpp assume that the null signature is a placeholder? i.e. today we do checkSig = ud->isCalledDirectly; but we could do checkSig = ud->isCalledDirectly && ud->signature;?

I don't feel strongly though and this approach certainly works.

Copy link
Member Author

@BertalanD BertalanD Apr 18, 2025

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

I don't think that would fix the issue: signatureMatches() already returns true when either function has a null signature.

In the test case, after LTO, the () -> void reference from a.ll is added first. The existing undefined symbol has a null signature, but this one doesn't, so we set existingFunction->signature to () -> void:

if (!existingFunction->signature && sig)
existingFunction->signature = sig;

As isCalledDirectly is set to true, we'll report the signature mismatch when we try to add the () -> i32 definition from b.ll.

I suppose an alternative would be to only set existingFunction->signature if the undefined symbol we replace it with is called directly, but that would mean we never set the signature if the function has no call sites, only its address is taken. That might cause problems later on.

Copy link
Collaborator

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Sounds good. LGTM as is.

Copy link
Member Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

thanks!

@BertalanD BertalanD merged commit bc48b3f into llvm:main Apr 18, 2025
14 checks passed
IanWood1 pushed a commit to IanWood1/llvm-project that referenced this pull request May 6, 2025
…6197)

When generating C++ vtables, Clang declares virtual functions as
`void(void)` when their signature is not known (e.g.parameter types are
forward-declared). As WASM type checks imports, this would conflict with
the real definition during linking. Commit 59f959f introduced a
workaround for this by deferring signature assignment until a definition
or direct call is seen.

When performing LTO, LLD first scans the bitcode files and creates
`DefinedFunction` symbol table entries for their contents. After LTO
codegen, they are replaced with `UndefinedFunction`s (so that the
definitions will be pulled in from the native LTO-d files when they are
added). At this point, if a function is only referenced in bitcode, its
signature remains `nullptr`.

From here, it should have behaved like in the non-LTO case: the first
direct call sets the signature. However, as the `isCalledDirectly` flag
was set to true, the missing signature was filled in by the type of the
first reference to the function, which could be a `void(void)` vtable
entry, which would then conflict with the real definition.

This commit sets `isCalledDirectly` to false so that the signature will
only be populated when a direct call is found.

See godotengine/godot#104497 and
emscripten-core/emscripten#10831
IanWood1 pushed a commit to IanWood1/llvm-project that referenced this pull request May 6, 2025
…6197)

When generating C++ vtables, Clang declares virtual functions as
`void(void)` when their signature is not known (e.g.parameter types are
forward-declared). As WASM type checks imports, this would conflict with
the real definition during linking. Commit 59f959f introduced a
workaround for this by deferring signature assignment until a definition
or direct call is seen.

When performing LTO, LLD first scans the bitcode files and creates
`DefinedFunction` symbol table entries for their contents. After LTO
codegen, they are replaced with `UndefinedFunction`s (so that the
definitions will be pulled in from the native LTO-d files when they are
added). At this point, if a function is only referenced in bitcode, its
signature remains `nullptr`.

From here, it should have behaved like in the non-LTO case: the first
direct call sets the signature. However, as the `isCalledDirectly` flag
was set to true, the missing signature was filled in by the type of the
first reference to the function, which could be a `void(void)` vtable
entry, which would then conflict with the real definition.

This commit sets `isCalledDirectly` to false so that the signature will
only be populated when a direct call is found.

See godotengine/godot#104497 and
emscripten-core/emscripten#10831
IanWood1 pushed a commit to IanWood1/llvm-project that referenced this pull request May 6, 2025
…6197)

When generating C++ vtables, Clang declares virtual functions as
`void(void)` when their signature is not known (e.g.parameter types are
forward-declared). As WASM type checks imports, this would conflict with
the real definition during linking. Commit 59f959f introduced a
workaround for this by deferring signature assignment until a definition
or direct call is seen.

When performing LTO, LLD first scans the bitcode files and creates
`DefinedFunction` symbol table entries for their contents. After LTO
codegen, they are replaced with `UndefinedFunction`s (so that the
definitions will be pulled in from the native LTO-d files when they are
added). At this point, if a function is only referenced in bitcode, its
signature remains `nullptr`.

From here, it should have behaved like in the non-LTO case: the first
direct call sets the signature. However, as the `isCalledDirectly` flag
was set to true, the missing signature was filled in by the type of the
first reference to the function, which could be a `void(void)` vtable
entry, which would then conflict with the real definition.

This commit sets `isCalledDirectly` to false so that the signature will
only be populated when a direct call is found.

See godotengine/godot#104497 and
emscripten-core/emscripten#10831
Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Projects
None yet
Development

Successfully merging this pull request may close these issues.

Web Export fails with Basis Universal textures
3 participants