-
Notifications
You must be signed in to change notification settings - Fork 13.6k
[AArch64][Clang] Fix linker error for function multiversioning #74358
New issue
Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.
By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.
Already on GitHub? Sign in to your account
Conversation
@llvm/pr-subscribers-clang-codegen @llvm/pr-subscribers-backend-aarch64 Author: Dani (DanielKristofKiss) ChangesAArch64 part of #71706. Default version is now mangled with .default. Patch is 74.82 KiB, truncated to 20.00 KiB below, full version: https://github.com/llvm/llvm-project/pull/74358.diff 9 Files Affected:
diff --git a/clang/docs/ReleaseNotes.rst b/clang/docs/ReleaseNotes.rst
index 43ea6a3ffd6e6..1d692a89f6287 100644
--- a/clang/docs/ReleaseNotes.rst
+++ b/clang/docs/ReleaseNotes.rst
@@ -648,6 +648,9 @@ Bug Fixes in This Version
- Fixed false positive error emitted by clang when performing qualified name
lookup and the current class instantiation has dependent bases.
Fixes (`#13826 <https://github.com/llvm/llvm-project/issues/13826>`_)
+- Fix the name of the ifunc symbol emitted for multiversion functions declared with the
+ ``target_clones`` attribute. This addresses a linker error that would otherwise occur
+ when these functions are referenced from other TUs.
Bug Fixes to Compiler Builtins
^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^
diff --git a/clang/include/clang/Basic/AttrDocs.td b/clang/include/clang/Basic/AttrDocs.td
index f2c4eb51b443d..35af8cee30e26 100644
--- a/clang/include/clang/Basic/AttrDocs.td
+++ b/clang/include/clang/Basic/AttrDocs.td
@@ -2515,6 +2515,12 @@ example, the following will emit 4 versions of the function:
__attribute__((target_clones("arch=atom,avx2","arch=ivybridge","default")))
void foo() {}
+Dispatch is done via ``ifunc`` mechanism. The assembler name of the indirect
+function is the original assembler name of the multiversioned function. For
+backward compatibility, an alias to this function is currently generated
+with an ``.ifunc`` suffix. This symbol is deprecated and will be removed
+in the future.
+
}];
}
diff --git a/clang/lib/CodeGen/CodeGenModule.cpp b/clang/lib/CodeGen/CodeGenModule.cpp
index dea58a7ff4146..a71d6ba49f1bb 100644
--- a/clang/lib/CodeGen/CodeGenModule.cpp
+++ b/clang/lib/CodeGen/CodeGenModule.cpp
@@ -1688,8 +1688,10 @@ static void AppendCPUSpecificCPUDispatchMangling(const CodeGenModule &CGM,
static void AppendTargetVersionMangling(const CodeGenModule &CGM,
const TargetVersionAttr *Attr,
raw_ostream &Out) {
- if (Attr->isDefaultVersion())
+ if (Attr->isDefaultVersion()) {
+ Out << ".default";
return;
+ }
Out << "._";
const TargetInfo &TI = CGM.getTarget();
llvm::SmallVector<StringRef, 8> Feats;
@@ -1752,8 +1754,10 @@ static void AppendTargetClonesMangling(const CodeGenModule &CGM,
const TargetInfo &TI = CGM.getTarget();
if (TI.getTriple().isAArch64()) {
StringRef FeatureStr = Attr->getFeatureStr(VersionIndex);
- if (FeatureStr == "default")
+ if (FeatureStr == "default") {
+ Out << ".default";
return;
+ }
Out << "._";
SmallVector<StringRef, 8> Features;
FeatureStr.split(Features, "+");
@@ -3999,6 +4003,8 @@ void CodeGenModule::EmitMultiVersionFunctionDefinition(GlobalDecl GD,
EmitGlobalFunctionDefinition(GD.getWithMultiVersionIndex(I), nullptr);
// Ensure that the resolver function is also emitted.
GetOrCreateMultiVersionResolver(GD);
+ } else if (FD->hasAttr<TargetVersionAttr>()) {
+ GetOrCreateMultiVersionResolver(GD);
} else
EmitGlobalFunctionDefinition(GD, GV);
}
@@ -4178,8 +4184,22 @@ void CodeGenModule::emitMultiVersionFunctions() {
}
llvm::Constant *ResolverConstant = GetOrCreateMultiVersionResolver(GD);
- if (auto *IFunc = dyn_cast<llvm::GlobalIFunc>(ResolverConstant))
+ if (auto *IFunc = dyn_cast<llvm::GlobalIFunc>(ResolverConstant)) {
ResolverConstant = IFunc->getResolver();
+ if (FD->isTargetClonesMultiVersion()) {
+ const CGFunctionInfo &FI = getTypes().arrangeGlobalDeclaration(GD);
+ llvm::FunctionType *DeclTy = getTypes().GetFunctionType(FI);
+ std::string MangledName = getMangledNameImpl(
+ *this, GD, FD, /*OmitMultiVersionMangling=*/true);
+ // In prior versions of Clang, the mangling for ifuncs incorrectly
+ // included an .ifunc suffix. This alias is generated for backward
+ // compatibility and should be deprecated in the future.
+ auto *Alias = llvm::GlobalAlias::create(
+ DeclTy, 0, getMultiversionLinkage(*this, GD),
+ MangledName + ".ifunc", IFunc, &getModule());
+ SetCommonAttributes(FD, Alias);
+ }
+ }
llvm::Function *ResolverFunc = cast<llvm::Function>(ResolverConstant);
ResolverFunc->setLinkage(getMultiversionLinkage(*this, GD));
@@ -4346,10 +4366,12 @@ llvm::Constant *CodeGenModule::GetOrCreateMultiVersionResolver(GlobalDecl GD) {
// Holds the name of the resolver, in ifunc mode this is the ifunc (which has
// a separate resolver).
std::string ResolverName = MangledName;
- if (getTarget().supportsIFunc())
- ResolverName += ".ifunc";
- else if (FD->isTargetMultiVersion())
+ if (getTarget().supportsIFunc()) {
+ if (!FD->isTargetClonesMultiVersion())
+ ResolverName += ".ifunc";
+ } else if (FD->isTargetMultiVersion()) {
ResolverName += ".resolver";
+ }
// If the resolver has already been created, just return it.
if (llvm::GlobalValue *ResolverGV = GetGlobalValue(ResolverName))
diff --git a/clang/test/CodeGen/attr-target-clones-aarch64.c b/clang/test/CodeGen/attr-target-clones-aarch64.c
index 3f2f2fdd24e8a..5ea3f4a9b0b11 100644
--- a/clang/test/CodeGen/attr-target-clones-aarch64.c
+++ b/clang/test/CodeGen/attr-target-clones-aarch64.c
@@ -22,27 +22,44 @@ int __attribute__((target_clones("default"))) main() {
inline int __attribute__((target_clones("fp16", "sve2-bitperm+fcma", "default"))) ftc_inline2(void) { return 2; };
-// CHECK: @__aarch64_cpu_features = external dso_local global { i64 }
-// CHECK: @ftc.ifunc = weak_odr ifunc i32 (), ptr @ftc.resolver
-// CHECK: @ftc_def.ifunc = weak_odr ifunc i32 (), ptr @ftc_def.resolver
-// CHECK: @ftc_dup1.ifunc = weak_odr ifunc i32 (), ptr @ftc_dup1.resolver
-// CHECK: @ftc_dup2.ifunc = weak_odr ifunc i32 (), ptr @ftc_dup2.resolver
-// CHECK: @ftc_inline1.ifunc = weak_odr ifunc i32 (), ptr @ftc_inline1.resolver
-// CHECK: @ftc_inline2.ifunc = weak_odr ifunc i32 (), ptr @ftc_inline2.resolver
-// CHECK: @ftc_inline3.ifunc = weak_odr ifunc i32 (), ptr @ftc_inline3.resolver
+
+
+//.
+// CHECK: @__aarch64_cpu_features = external dso_local global { i64 }
+// CHECK: @ftc.ifunc = weak_odr alias i32 (), ptr @ftc
+// CHECK: @ftc_def.ifunc = weak_odr alias i32 (), ptr @ftc_def
+// CHECK: @ftc_dup1.ifunc = weak_odr alias i32 (), ptr @ftc_dup1
+// CHECK: @ftc_dup2.ifunc = weak_odr alias i32 (), ptr @ftc_dup2
+// CHECK: @ftc_inline1.ifunc = weak_odr alias i32 (), ptr @ftc_inline1
+// CHECK: @ftc_inline2.ifunc = weak_odr alias i32 (), ptr @ftc_inline2
+// CHECK: @ftc_inline3.ifunc = weak_odr alias i32 (), ptr @ftc_inline3
+// CHECK: @ftc = weak_odr ifunc i32 (), ptr @ftc.resolver
+// CHECK: @ftc_def = weak_odr ifunc i32 (), ptr @ftc_def.resolver
+// CHECK: @ftc_dup1 = weak_odr ifunc i32 (), ptr @ftc_dup1.resolver
+// CHECK: @ftc_dup2 = weak_odr ifunc i32 (), ptr @ftc_dup2.resolver
+// CHECK: @ftc_inline1 = weak_odr ifunc i32 (), ptr @ftc_inline1.resolver
+// CHECK: @ftc_inline2 = weak_odr ifunc i32 (), ptr @ftc_inline2.resolver
+// CHECK: @ftc_inline3 = weak_odr ifunc i32 (), ptr @ftc_inline3.resolver
+//.
// CHECK: Function Attrs: noinline nounwind optnone
// CHECK-LABEL: @ftc._MlseMaes(
// CHECK-NEXT: entry:
// CHECK-NEXT: ret i32 0
+//
+//
// CHECK: Function Attrs: noinline nounwind optnone
// CHECK-LABEL: @ftc._Msve2(
// CHECK-NEXT: entry:
// CHECK-NEXT: ret i32 0
+//
+//
// CHECK: Function Attrs: noinline nounwind optnone
-// CHECK-LABEL: @ftc(
+// CHECK-LABEL: @ftc.default(
// CHECK-NEXT: entry:
// CHECK-NEXT: ret i32 0
+//
+//
// CHECK-LABEL: @ftc.resolver(
// CHECK-NEXT: resolver_entry:
// CHECK-NEXT: call void @__init_cpu_features_resolver()
@@ -62,19 +79,27 @@ inline int __attribute__((target_clones("fp16", "sve2-bitperm+fcma", "default"))
// CHECK: resolver_return1:
// CHECK-NEXT: ret ptr @ftc._Msve2
// CHECK: resolver_else2:
-// CHECK-NEXT: ret ptr @ftc
+// CHECK-NEXT: ret ptr @ftc.default
+//
+//
// CHECK: Function Attrs: noinline nounwind optnone
// CHECK-LABEL: @ftc_def._Msha2(
// CHECK-NEXT: entry:
// CHECK-NEXT: ret i32 1
+//
+//
// CHECK: Function Attrs: noinline nounwind optnone
// CHECK-LABEL: @ftc_def._Msha2Mmemtag2(
// CHECK-NEXT: entry:
// CHECK-NEXT: ret i32 1
+//
+//
// CHECK: Function Attrs: noinline nounwind optnone
-// CHECK-LABEL: @ftc_def(
+// CHECK-LABEL: @ftc_def.default(
// CHECK-NEXT: entry:
// CHECK-NEXT: ret i32 1
+//
+//
// CHECK-LABEL: @ftc_def.resolver(
// CHECK-NEXT: resolver_entry:
// CHECK-NEXT: call void @__init_cpu_features_resolver()
@@ -94,15 +119,21 @@ inline int __attribute__((target_clones("fp16", "sve2-bitperm+fcma", "default"))
// CHECK: resolver_return1:
// CHECK-NEXT: ret ptr @ftc_def._Msha2
// CHECK: resolver_else2:
-// CHECK-NEXT: ret ptr @ftc_def
+// CHECK-NEXT: ret ptr @ftc_def.default
+//
+//
// CHECK: Function Attrs: noinline nounwind optnone
// CHECK-LABEL: @ftc_dup1._Msha2(
// CHECK-NEXT: entry:
// CHECK-NEXT: ret i32 2
+//
+//
// CHECK: Function Attrs: noinline nounwind optnone
-// CHECK-LABEL: @ftc_dup1(
+// CHECK-LABEL: @ftc_dup1.default(
// CHECK-NEXT: entry:
// CHECK-NEXT: ret i32 2
+//
+//
// CHECK-LABEL: @ftc_dup1.resolver(
// CHECK-NEXT: resolver_entry:
// CHECK-NEXT: call void @__init_cpu_features_resolver()
@@ -114,19 +145,27 @@ inline int __attribute__((target_clones("fp16", "sve2-bitperm+fcma", "default"))
// CHECK: resolver_return:
// CHECK-NEXT: ret ptr @ftc_dup1._Msha2
// CHECK: resolver_else:
-// CHECK-NEXT: ret ptr @ftc_dup1
+// CHECK-NEXT: ret ptr @ftc_dup1.default
+//
+//
// CHECK: Function Attrs: noinline nounwind optnone
// CHECK-LABEL: @ftc_dup2._Mfp(
// CHECK-NEXT: entry:
// CHECK-NEXT: ret i32 3
+//
+//
// CHECK: Function Attrs: noinline nounwind optnone
// CHECK-LABEL: @ftc_dup2._MdotprodMcrc(
// CHECK-NEXT: entry:
// CHECK-NEXT: ret i32 3
+//
+//
// CHECK: Function Attrs: noinline nounwind optnone
-// CHECK-LABEL: @ftc_dup2(
+// CHECK-LABEL: @ftc_dup2.default(
// CHECK-NEXT: entry:
// CHECK-NEXT: ret i32 3
+//
+//
// CHECK-LABEL: @ftc_dup2.resolver(
// CHECK-NEXT: resolver_entry:
// CHECK-NEXT: call void @__init_cpu_features_resolver()
@@ -146,35 +185,43 @@ inline int __attribute__((target_clones("fp16", "sve2-bitperm+fcma", "default"))
// CHECK: resolver_return1:
// CHECK-NEXT: ret ptr @ftc_dup2._Mfp
// CHECK: resolver_else2:
-// CHECK-NEXT: ret ptr @ftc_dup2
+// CHECK-NEXT: ret ptr @ftc_dup2.default
+//
+//
// CHECK: Function Attrs: noinline nounwind optnone
// CHECK-LABEL: @foo(
// CHECK-NEXT: entry:
-// CHECK-NEXT: [[CALL:%.*]] = call i32 @ftc.ifunc()
-// CHECK-NEXT: [[CALL1:%.*]] = call i32 @ftc_def.ifunc()
+// CHECK-NEXT: [[CALL:%.*]] = call i32 @ftc()
+// CHECK-NEXT: [[CALL1:%.*]] = call i32 @ftc_def()
// CHECK-NEXT: [[ADD:%.*]] = add nsw i32 [[CALL]], [[CALL1]]
-// CHECK-NEXT: [[CALL2:%.*]] = call i32 @ftc_dup1.ifunc()
+// CHECK-NEXT: [[CALL2:%.*]] = call i32 @ftc_dup1()
// CHECK-NEXT: [[ADD3:%.*]] = add nsw i32 [[ADD]], [[CALL2]]
-// CHECK-NEXT: [[CALL4:%.*]] = call i32 @ftc_dup2.ifunc()
+// CHECK-NEXT: [[CALL4:%.*]] = call i32 @ftc_dup2()
// CHECK-NEXT: [[ADD5:%.*]] = add nsw i32 [[ADD3]], [[CALL4]]
// CHECK-NEXT: ret i32 [[ADD5]]
+//
+//
// CHECK: Function Attrs: noinline nounwind optnone
// CHECK-LABEL: @ftc_direct(
// CHECK-NEXT: entry:
// CHECK-NEXT: ret i32 4
+//
+//
// CHECK: Function Attrs: noinline nounwind optnone
// CHECK-LABEL: @main(
// CHECK-NEXT: entry:
// CHECK-NEXT: [[RETVAL:%.*]] = alloca i32, align 4
// CHECK-NEXT: store i32 0, ptr [[RETVAL]], align 4
-// CHECK-NEXT: [[CALL:%.*]] = call i32 @ftc_inline1.ifunc()
-// CHECK-NEXT: [[CALL1:%.*]] = call i32 @ftc_inline2.ifunc()
+// CHECK-NEXT: [[CALL:%.*]] = call i32 @ftc_inline1()
+// CHECK-NEXT: [[CALL1:%.*]] = call i32 @ftc_inline2()
// CHECK-NEXT: [[ADD:%.*]] = add nsw i32 [[CALL]], [[CALL1]]
-// CHECK-NEXT: [[CALL2:%.*]] = call i32 @ftc_inline3.ifunc()
+// CHECK-NEXT: [[CALL2:%.*]] = call i32 @ftc_inline3()
// CHECK-NEXT: [[ADD3:%.*]] = add nsw i32 [[ADD]], [[CALL2]]
// CHECK-NEXT: [[CALL4:%.*]] = call i32 @ftc_direct()
// CHECK-NEXT: [[ADD5:%.*]] = add nsw i32 [[ADD3]], [[CALL4]]
// CHECK-NEXT: ret i32 [[ADD5]]
+//
+//
// CHECK-LABEL: @ftc_inline1.resolver(
// CHECK-NEXT: resolver_entry:
// CHECK-NEXT: call void @__init_cpu_features_resolver()
@@ -202,7 +249,9 @@ inline int __attribute__((target_clones("fp16", "sve2-bitperm+fcma", "default"))
// CHECK: resolver_return3:
// CHECK-NEXT: ret ptr @ftc_inline1._MrngMsimd
// CHECK: resolver_else4:
-// CHECK-NEXT: ret ptr @ftc_inline1
+// CHECK-NEXT: ret ptr @ftc_inline1.default
+//
+//
// CHECK-LABEL: @ftc_inline2.resolver(
// CHECK-NEXT: resolver_entry:
// CHECK-NEXT: call void @__init_cpu_features_resolver()
@@ -222,7 +271,9 @@ inline int __attribute__((target_clones("fp16", "sve2-bitperm+fcma", "default"))
// CHECK: resolver_return1:
// CHECK-NEXT: ret ptr @ftc_inline2._Mfp16
// CHECK: resolver_else2:
-// CHECK-NEXT: ret ptr @ftc_inline2
+// CHECK-NEXT: ret ptr @ftc_inline2.default
+//
+//
// CHECK-LABEL: @ftc_inline3.resolver(
// CHECK-NEXT: resolver_entry:
// CHECK-NEXT: call void @__init_cpu_features_resolver()
@@ -242,63 +293,93 @@ inline int __attribute__((target_clones("fp16", "sve2-bitperm+fcma", "default"))
// CHECK: resolver_return1:
// CHECK-NEXT: ret ptr @ftc_inline3._Mbti
// CHECK: resolver_else2:
-// CHECK-NEXT: ret ptr @ftc_inline3
+// CHECK-NEXT: ret ptr @ftc_inline3.default
+//
+//
// CHECK: Function Attrs: noinline nounwind optnone
// CHECK-LABEL: @ftc_inline1._MrngMsimd(
// CHECK-NEXT: entry:
// CHECK-NEXT: ret i32 1
+//
+//
// CHECK: Function Attrs: noinline nounwind optnone
// CHECK-LABEL: @ftc_inline1._MrcpcMpredres(
// CHECK-NEXT: entry:
// CHECK-NEXT: ret i32 1
+//
+//
// CHECK: Function Attrs: noinline nounwind optnone
// CHECK-LABEL: @ftc_inline1._Msve2-aesMwfxt(
// CHECK-NEXT: entry:
// CHECK-NEXT: ret i32 1
+//
+//
// CHECK: Function Attrs: noinline nounwind optnone
-// CHECK-LABEL: @ftc_inline1(
+// CHECK-LABEL: @ftc_inline1.default(
// CHECK-NEXT: entry:
// CHECK-NEXT: ret i32 1
+//
+//
// CHECK: Function Attrs: noinline nounwind optnone
// CHECK-LABEL: @ftc_inline2._Mfp16(
// CHECK-NEXT: entry:
// CHECK-NEXT: ret i32 2
+//
+//
// CHECK: Function Attrs: noinline nounwind optnone
// CHECK-LABEL: @ftc_inline2._MfcmaMsve2-bitperm(
// CHECK-NEXT: entry:
// CHECK-NEXT: ret i32 2
+//
+//
// CHECK: Function Attrs: noinline nounwind optnone
-// CHECK-LABEL: @ftc_inline2(
+// CHECK-LABEL: @ftc_inline2.default(
// CHECK-NEXT: entry:
// CHECK-NEXT: ret i32 2
+//
+//
// CHECK: Function Attrs: noinline nounwind optnone
// CHECK-LABEL: @ftc_inline3._Mbti(
// CHECK-NEXT: entry:
// CHECK-NEXT: ret i32 3
+//
+//
// CHECK: Function Attrs: noinline nounwind optnone
// CHECK-LABEL: @ftc_inline3._MsveMsb(
// CHECK-NEXT: entry:
// CHECK-NEXT: ret i32 3
+//
+//
// CHECK: Function Attrs: noinline nounwind optnone
-// CHECK-LABEL: @ftc_inline3(
+// CHECK-LABEL: @ftc_inline3.default(
// CHECK-NEXT: entry:
// CHECK-NEXT: ret i32 3
+//
+//
// CHECK-NOFMV: Function Attrs: noinline nounwind optnone
// CHECK-NOFMV-LABEL: @ftc(
// CHECK-NOFMV-NEXT: entry:
// CHECK-NOFMV-NEXT: ret i32 0
+//
+//
// CHECK-NOFMV: Function Attrs: noinline nounwind optnone
// CHECK-NOFMV-LABEL: @ftc_def(
// CHECK-NOFMV-NEXT: entry:
// CHECK-NOFMV-NEXT: ret i32 1
+//
+//
// CHECK-NOFMV: Function Attrs: noinline nounwind optnone
// CHECK-NOFMV-LABEL: @ftc_dup1(
// CHECK-NOFMV-NEXT: entry:
// CHECK-NOFMV-NEXT: ret i32 2
+//
+//
// CHECK-NOFMV: Function Attrs: noinline nounwind optnone
// CHECK-NOFMV-LABEL: @ftc_dup2(
// CHECK-NOFMV-NEXT: entry:
// CHECK-NOFMV-NEXT: ret i32 3
+//
+//
// CHECK-NOFMV: Function Attrs: noinline nounwind optnone
// CHECK-NOFMV-LABEL: @foo(
// CHECK-NOFMV-NEXT: entry:
@@ -310,10 +391,14 @@ inline int __attribute__((target_clones("fp16", "sve2-bitperm+fcma", "default"))
// CHECK-NOFMV-NEXT: [[CALL4:%.*]] = call i32 @ftc_dup2()
// CHECK-NOFMV-NEXT: [[ADD5:%.*]] = add nsw i32 [[ADD3]], [[CALL4]]
// CHECK-NOFMV-NEXT: ret i32 [[ADD5]]
+//
+//
// CHECK-NOFMV: Function Attrs: noinline nounwind optnone
// CHECK-NOFMV-LABEL: @ftc_direct(
// CHECK-NOFMV-NEXT: entry:
// CHECK-NOFMV-NEXT: ret i32 4
+//
+//
// CHECK-NOFMV: Function Attrs: noinline nounwind optnone
// CHECK-NOFMV-LABEL: @main(
// CHECK-NOFMV-NEXT: entry:
@@ -327,21 +412,29 @@ inline int __attribute__((target_clones("fp16", "sve2-bitperm+fcma", "default"))
// CHECK-NOFMV-NEXT: [[CALL4:%.*]] = call i32 @ftc_direct()
// CHECK-NOFMV-NEXT: [[ADD5:%.*]] = add nsw i32 [[ADD3]], [[CALL4]]
// CHECK-NOFMV-NEXT: ret i32 [[ADD5]]
-
-// CHECK: attributes #0 = { noinline nounwind optnone "no-trapping-math"="true" "stack-protector-buffer-size"="8" "target-features"="+fp-armv8,+lse,+neon" }
-// CHECK: attributes #1 = { noinline nounwind optnone "no-trapping-math"="true" "stack-protector-buffer-size"="8" "target-features"="+fp-armv8,+fullfp16,+neon,+sve,+sve2" }
-// CHECK: attributes #2 = { noinline nounwind optnone "no-trapping-math"="true" "stack-protector-buffer-size"="8" }
-// CHECK: attributes #3 = { noinline nounwind optnone "no-trapping-math"="true" "stack-protector-buffer-size"="8" "target-features"="+fp-armv8,+neon,+sha2" }
-// CHECK: attributes #4 = { noinline nounwind optnone "no-trapping-math"="true" "stack-protector-buffer-size"="8" "target-features"="+fp-armv8,+mte,+neon,+sha2" }
-// CHECK: attributes #5 = { noinline nounwind optnone "no-trapping-math"="true" "stack-protector-buffer-size"="8" "target-features"="+fp-armv8,+neon" }
-// CHECK: attributes #6 = { noinline nounwind optnone "no-trapping-math"="true" "stack-protector-buffer-size"="8" "target-features"="+crc,+dotprod,+fp-armv8,+neon" }
-// CHECK: attributes #7 = { noinline nounwind optnone "no-trapping-math"="true" "stack-protector-buffer-size"="8" "target-features"="+fp-armv8,+neon,+rand" }
-// CHECK: attributes #8 = { noinline nounwind optnone "no-trapping-math"="true" "stack-protector-buffer-size"="8" "target-features"="+predres,+rcpc" }
-// CHECK: attributes #9 = { noinline nounwind optnone "no-trapping-math"="true" "stack-protector-buffer-size"="8" "target-features"="+fp-armv8,+fullfp16,+neon,+sve,+sve2,+sve2-aes,+wfxt" }
-// CHECK: attributes #10 = { noinline nounwind optnone "no-trapping-math"="true" "stack-protector-buffer-size"="8" "target-features"="+fp-armv8,+fullfp16,+neon" }
-// CHECK: attributes #11 = { noinline nounwind optnone "no-trapping-math"="true" "stack-protector-buffer-size"="8" "target-features"="+complxnum,+fp-armv8,+fullfp16,+neon,+sve,+sve2,+sve2-bitperm" }
-// CHECK: attributes #12 = { noinline nounwind optnone "no-trapping-math"="true" "stack-protector-buffer-size"="8" "target-features"="+bti" }
-// CHECK: attributes #13 = { noinline nounwind optnone "no-trapping-math"="true" "stack-protector-buffer-size"="8" "target-features"="+fp-armv8,+fullfp16,+neon,+sb,+sve" }
-
-// CHECK-NOFMV: attributes #0 = { noinline nounwind optnone "no-trapping-math"="true" "stack-protector-buffer-size"="8" "target-features"="-fmv" }
-// CHECK-NOFMV: attributes #1 = { "no-trapping-math"="true" "stack-protector-buffer-size"="8" "target-features"="-fmv" }
+//
+//.
+// CHECK: attributes #[[ATTR0:[0-9]+]] = { noinline nounwind optnone "no-trapping-math"="true" "stack-protector-buffer-size"="8" "target-features"="+fp-armv8,+lse,+neon" }
+// CHECK: attributes #[[ATTR1:[0-9]+]] = { noinline nounwind optnone "no-trapping-math"="true" "stack-protector-buffer-size"="8" "target-features"="+fp-armv8,+fullfp16,+neon,+sve,+sve2" }
+// CHECK: attributes #[[ATTR2:[0-9]+]] = { noinline nounwind optnone "no-trapping-math"="true" "stack-protector-buffer-size"="8" }
+// CHECK: attributes #[[ATTR3:[0-9]+]] = { noinline nounwind optnone "no-trapping-math"="true" "stack-protector-buffer-size"="8" "target-features"="+fp-armv8,+neon,+sha2" }
+// CHECK: attributes #[[ATTR4:[0-9]+]] = { noinline nounwind optnone "no-trapping-math"="true" "stack-protector-buffer-size"="8" "target-features"="+fp-armv8,+mte,+neon,+sha2" }
+// CHECK: attributes #[[ATTR5:[0-9]+]] = { noinlin...
[truncated]
|
clang/lib/CodeGen/CodeGenModule.cpp
Outdated
// included an .ifunc suffix. This alias is generated for backward | ||
// compatibility and should be deprecated in the future. |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
// included an .ifunc suffix. This alias is generated for backward | |
// compatibility and should be deprecated in the future. | |
// included an .ifunc suffix. This alias is generated for backward | |
// compatibility. It is deprecated, and may be removed in the future. |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Added to the original series.
287837d
to
e881c25
Compare
AArch64 part of llvm#71706. Default version is now mangled with .default. Resolver for the TargetVersion need to be emitted from the CodeGenModule::EmitMultiVersionFunctionDefinition.
e881c25
to
4a8fcef
Compare
@@ -1752,8 +1754,10 @@ static void AppendTargetClonesMangling(const CodeGenModule &CGM, | |||
const TargetInfo &TI = CGM.getTarget(); | |||
if (TI.getTriple().isAArch64()) { |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Can you please explain why AArch64 needs to be 'specially' handled here? I can't recall where but other targets get a .default mangling as well right? Why isn't AArch64 being handled wherever it is happening for other targets?
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
As I see for other targets "default"
is just copied as a feature string.
For AArch64 we want to have clones per feature or set of features instead of CPU.
__attribute__((target_clones("crc32", "aes+sha1")))
int foo(){..}
would give 3 clones; the default, crc32
and one for aes
plus sha1
mangled as foo.default
, foo._Mcrc32
, foo._Msha1Maes
respectively.
Also "aes+sha1" should be the same as "sha1+aes" after mangling.
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
@DanielKristofKiss slightly off topic: it's too bad default
is implied, as it makes target_clones
incompatible with target_version
for constructs like this:
__attribute__((target_clones("dotprod", "aes")))
int callee(void) {
return 42;
}
__attribute__((target_version("default")))
int callee(void) {
return 0;
}
int caller(void) {
return callee();
}
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Right, it's too bad this incompatibility is baked into the spec. Are there other reasons why this can't be supported?
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
It was not supported in the original function multi versioning neither and excluded to that as we building top of it.
Anyway created a ticket for it ARM-software/acle#291 . I need to check what GCC and others say about it.
ping |
…74358) AArch64 part of llvm#71706. Default version is now mangled with .default. Resolver for the TargetVersion need to be emitted from the CodeGenModule::EmitMultiVersionFunctionDefinition.
…74358) AArch64 part of llvm#71706. Default version is now mangled with .default. Resolver for the TargetVersion need to be emitted from the CodeGenModule::EmitMultiVersionFunctionDefinition.
AArch64 part of #71706.
Default version is now mangled with .default.
Resolver for the TargetVersion need to be emitted from the
CodeGenModule::EmitMultiVersionFunctionDefinition.