-
Notifications
You must be signed in to change notification settings - Fork 13.5k
[lld][WebAssembly] Fix spurious signature mismatch under LTO #136197
New issue
Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.
By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.
Already on GitHub? Sign in to your account
Conversation
When generating C++ vtables, Clang declares virtual functions as `void(void)` when their signature is not known (e.g.parameter types are forward-declared). As WASM type checks imports, this would conflict with the real definition during linking. Commit 59f959f introduced a workaround for this by deferring signature assignment until a definition or direct call is seen. When performing LTO, LLD first scans the bitcode files and creates `DefinedFunction` symbol table entries for their contents. After LTO codegen, they are replaced with `UndefinedFunction`s (so that the definitions will be pulled in from the native LTO-d files when they are added). At this point, if a function is only referenced in bitcode, its signature remains `nullptr`. From here, it should have behaved like in the non-LTO case: the first direct call sets the signature. However, as the `isCalledDirectly` flag was set to true, the missing signature was filled in by the type of the first reference to the function, which could be a `void(void)` vtable entry, which would then conflict with the real definition. This commit sets `isCalledDirectly` to false so that the signature will only be populated when a direct call is found. Potentially fixes godotengine/godot#104497
@llvm/pr-subscribers-lld @llvm/pr-subscribers-lld-wasm Author: Daniel Bertalan (BertalanD) ChangesWhen generating C++ vtables, Clang declares virtual functions as When performing LTO, LLD first scans the bitcode files and creates From here, it should have behaved like in the non-LTO case: the first direct call sets the signature. However, as the This commit sets Potentially fixes godotengine/godot#104497 Full diff: https://github.com/llvm/llvm-project/pull/136197.diff 2 Files Affected:
diff --git a/lld/test/wasm/lto/thinlto-signature-mismatch-unknown.ll b/lld/test/wasm/lto/thinlto-signature-mismatch-unknown.ll
new file mode 100644
index 0000000000000..2f216e5853ff8
--- /dev/null
+++ b/lld/test/wasm/lto/thinlto-signature-mismatch-unknown.ll
@@ -0,0 +1,45 @@
+; RUN: rm -rf %t && split-file %s %t && cd %t
+; RUN: opt -thinlto-bc a.ll -o a.o
+; RUN: opt -thinlto-bc b.ll -o b.o
+; RUN: llvm-ar rcs b.a b.o
+; RUN: opt -thinlto-bc c.ll -o c.o
+
+;; Taking the address of the incorrectly declared @foo should not generate a warning.
+; RUN: wasm-ld --fatal-warnings --no-entry --export-all a.o b.a -o a.out \
+; RUN: | FileCheck %s --implicit-check-not 'warning' --allow-empty
+
+;; But we should still warn if we call the function with the wrong signature.
+; RUN: not wasm-ld --fatal-warnings --no-entry --export-all a.o b.a c.o -o b.out 2>&1 \
+; RUN: | FileCheck %s --check-prefix=INVALID
+
+; INVALID: error: function signature mismatch: foo
+; INVALID: >>> defined as () -> void
+; INVALID: >>> defined as () -> i32
+
+;--- a.ll
+target datalayout = "e-m:e-p:32:32-p10:8:8-p20:8:8-i64:64-n32:64-S128-ni:1:10:20"
+target triple = "wasm32-unknown-unknown"
+
+@ptr = constant ptr @foo
+declare void @foo()
+
+;--- b.ll
+target datalayout = "e-m:e-p:32:32-p10:8:8-p20:8:8-i64:64-n32:64-S128-ni:1:10:20"
+target triple = "wasm32-unknown-unknown"
+
+define i32 @foo() noinline {
+entry:
+ ret i32 42
+}
+
+;--- c.ll
+target datalayout = "e-m:e-p:32:32-p10:8:8-p20:8:8-i64:64-n32:64-S128-ni:1:10:20"
+target triple = "wasm32-unknown-unknown"
+
+declare void @foo()
+
+define void @invalid() {
+entry:
+ call void @foo()
+ ret void
+}
diff --git a/lld/wasm/LTO.cpp b/lld/wasm/LTO.cpp
index b9bd48acd6dc1..ab63281012eae 100644
--- a/lld/wasm/LTO.cpp
+++ b/lld/wasm/LTO.cpp
@@ -108,9 +108,10 @@ BitcodeCompiler::~BitcodeCompiler() = default;
static void undefine(Symbol *s) {
if (auto f = dyn_cast<DefinedFunction>(s))
+ // If the signature is null, there were no calls from non-bitcode objects.
replaceSymbol<UndefinedFunction>(f, f->getName(), std::nullopt,
std::nullopt, 0, f->getFile(),
- f->signature);
+ f->signature, f->signature != nullptr);
else if (isa<DefinedData>(s))
replaceSymbol<UndefinedData>(s, s->getName(), 0, s->getFile());
else
|
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Nice! Thanks for working on this.
replaceSymbol<UndefinedFunction>(f, f->getName(), std::nullopt, | ||
std::nullopt, 0, f->getFile(), | ||
f->signature); | ||
f->signature, f->signature != nullptr); |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
I wonder if we should instead make SymbolTable.cpp assume that the null signature is a placeholder? i.e. today we do checkSig = ud->isCalledDirectly;
but we could do checkSig = ud->isCalledDirectly && ud->signature;
?
I don't feel strongly though and this approach certainly works.
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
I don't think that would fix the issue: signatureMatches()
already returns true
when either function has a null signature.
In the test case, after LTO, the () -> void
reference from a.ll
is added first. The existing undefined symbol has a null signature, but this one doesn't, so we set existingFunction->signature
to () -> void
:
llvm-project/lld/wasm/SymbolTable.cpp
Lines 657 to 658 in 1db03ca
if (!existingFunction->signature && sig) | |
existingFunction->signature = sig; |
As isCalledDirectly
is set to true, we'll report the signature mismatch when we try to add the () -> i32
definition from b.ll
.
I suppose an alternative would be to only set existingFunction->signature
if the undefined symbol we replace it with is called directly, but that would mean we never set the signature if the function has no call sites, only its address is taken. That might cause problems later on.
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Sounds good. LGTM as is.
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
thanks!
…6197) When generating C++ vtables, Clang declares virtual functions as `void(void)` when their signature is not known (e.g.parameter types are forward-declared). As WASM type checks imports, this would conflict with the real definition during linking. Commit 59f959f introduced a workaround for this by deferring signature assignment until a definition or direct call is seen. When performing LTO, LLD first scans the bitcode files and creates `DefinedFunction` symbol table entries for their contents. After LTO codegen, they are replaced with `UndefinedFunction`s (so that the definitions will be pulled in from the native LTO-d files when they are added). At this point, if a function is only referenced in bitcode, its signature remains `nullptr`. From here, it should have behaved like in the non-LTO case: the first direct call sets the signature. However, as the `isCalledDirectly` flag was set to true, the missing signature was filled in by the type of the first reference to the function, which could be a `void(void)` vtable entry, which would then conflict with the real definition. This commit sets `isCalledDirectly` to false so that the signature will only be populated when a direct call is found. See godotengine/godot#104497 and emscripten-core/emscripten#10831
…6197) When generating C++ vtables, Clang declares virtual functions as `void(void)` when their signature is not known (e.g.parameter types are forward-declared). As WASM type checks imports, this would conflict with the real definition during linking. Commit 59f959f introduced a workaround for this by deferring signature assignment until a definition or direct call is seen. When performing LTO, LLD first scans the bitcode files and creates `DefinedFunction` symbol table entries for their contents. After LTO codegen, they are replaced with `UndefinedFunction`s (so that the definitions will be pulled in from the native LTO-d files when they are added). At this point, if a function is only referenced in bitcode, its signature remains `nullptr`. From here, it should have behaved like in the non-LTO case: the first direct call sets the signature. However, as the `isCalledDirectly` flag was set to true, the missing signature was filled in by the type of the first reference to the function, which could be a `void(void)` vtable entry, which would then conflict with the real definition. This commit sets `isCalledDirectly` to false so that the signature will only be populated when a direct call is found. See godotengine/godot#104497 and emscripten-core/emscripten#10831
…6197) When generating C++ vtables, Clang declares virtual functions as `void(void)` when their signature is not known (e.g.parameter types are forward-declared). As WASM type checks imports, this would conflict with the real definition during linking. Commit 59f959f introduced a workaround for this by deferring signature assignment until a definition or direct call is seen. When performing LTO, LLD first scans the bitcode files and creates `DefinedFunction` symbol table entries for their contents. After LTO codegen, they are replaced with `UndefinedFunction`s (so that the definitions will be pulled in from the native LTO-d files when they are added). At this point, if a function is only referenced in bitcode, its signature remains `nullptr`. From here, it should have behaved like in the non-LTO case: the first direct call sets the signature. However, as the `isCalledDirectly` flag was set to true, the missing signature was filled in by the type of the first reference to the function, which could be a `void(void)` vtable entry, which would then conflict with the real definition. This commit sets `isCalledDirectly` to false so that the signature will only be populated when a direct call is found. See godotengine/godot#104497 and emscripten-core/emscripten#10831
When generating C++ vtables, Clang declares virtual functions as
void(void)
when their signature is not known (e.g.parameter types are forward-declared). As WASM type checks imports, this would conflict with the real definition during linking. Commit 59f959f introduced a workaround for this by deferring signature assignment until a definition or direct call is seen.When performing LTO, LLD first scans the bitcode files and creates
DefinedFunction
symbol table entries for their contents. After LTO codegen, they are replaced withUndefinedFunction
s (so that the definitions will be pulled in from the native LTO-d files when they are added). At this point, if a function is only referenced in bitcode, its signature remainsnullptr
.From here, it should have behaved like in the non-LTO case: the first direct call sets the signature. However, as the
isCalledDirectly
flag was set to true, the missing signature was filled in by the type of the first reference to the function, which could be avoid(void)
vtable entry, which would then conflict with the real definition.This commit sets
isCalledDirectly
to false so that the signature will only be populated when a direct call is found.Potentially fixes godotengine/godot#104497 and emscripten-core/emscripten#10831