clang++ incompatibility with libstdc++ gcc4-compatible ABI on Windows #135910
Comments
I don't have rights to set labels, assuming there are more relevant ones. |
MSYS2's |
…fined work around llvm#135910 see also msys2/MSYS2-packages#5329
I'm also seeing a crash with the following simple "hello world" program. I haven't been able to see a relevant difference in the -S output yet (I see something fishy around
#include <iostream>
int main() {
std::cout << "Hello" << std::endl;
return 0;
}
This crashes on Cygwin but not on MSYS2, so it seems likely to be due to the C++11 ABI option. |
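To check which half of the libstdc++ dual ABI a given translation unit is actually compiled against, something like the following minimal sketch may help. It relies on the `_GLIBCXX_USE_CXX11_ABI` and `__GLIBCXX__` macros that libstdc++ headers define; the helper name `libstdcxx_string_abi` is hypothetical and only for illustration.

```cpp
#include <string>  // any libstdc++ header defines _GLIBCXX_USE_CXX11_ABI

// Hypothetical helper: names the std::string ABI this translation unit
// was compiled against, based on libstdc++'s configuration macros.
constexpr const char *libstdcxx_string_abi() {
#if !defined(__GLIBCXX__)
  return "not libstdc++";        // e.g. libc++ or the MSVC STL
#elif defined(_GLIBCXX_USE_CXX11_ABI) && _GLIBCXX_USE_CXX11_ABI
  return "cxx11";                // the newer ABI
#else
  return "gcc4-compatible";      // the old ABI, the affected case here
#endif
}
```

Building the same file with and without -D_GLIBCXX_USE_CXX11_ABI=0 should flip the result on a libstdc++ toolchain that supports both ABIs.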
I haven't really studied the libstdc++ code closely here to understand what's going on... Is this a case where the Cygwin GCC build somehow sets different ABI defines somewhere, that we should try to match? From the fix attempts, it seems so? It's a bit annoying if this is based on a configure option, which sets custom hardcoded values for that GCC build, but which may change at some point in the future... For reference, see 4e6c8f1, where I added a similar libstdc++ define for a different case. That one was attempting to fix a define that GCC itself sets in That particular commit adds it in |
No, the "fix attempts" are really more "workaround attempts": they switch to the other half of the "dual ABI", because that one is not broken. The default ABI on Cygwin (set by a GCC configure option, retained for compatibility with old libraries) is broken with Clang. It apparently doesn't treat the static member of an extern explicit template class instantiation as |
Thanks, I see! Ok that makes it clearer what to look for here. Is it possible to reduce a minimal testcase with the problematic code without any includes at all? Does Clang do the right thing for a mingw target, for such a testcase, or is this a case which we just haven't hit there either? |
Clang does not do the "right thing" (doesn't do the same thing as GCC, up to someone else to decide which is "right") for mingw target either, which convinced me to open this issue. I might be able to put something simple together, I don't know if it requires a DLL in the mix or if it will reproduce with just two TUs linked together. |
Slightly more complicated than I expected, but not too bad for a simple reproducer:
#include <stdio.h>
template<typename T>
class foo
{
public:
class inner
{
public:
static char buffer[];
} i;
};
template<typename T>
char foo<T>::inner::buffer[10];
extern template class foo<char>;
int main(void)
{
foo<char> bar;
printf("%p\n", bar.i.buffer);
return 0;
}
it should give an undefined reference if you try to link it, like
but clang++ succeeds, because it doesn't treat |
Thanks, I see! I also can observe that Clang does behave like GCC here, for a Linux target. So there's some target specific logic that ends up going differently between the mingw/cygwin targets and Linux here. |
I've reduced the case down a bit further:
template<typename T>
class foo
{
public:
class inner
{
public:
static char buffer[];
} i;
};
template<typename T>
char foo<T>::inner::buffer[10];
extern template class foo<char>;
void other(void *ptr);
void func(void)
{
foo<char> bar;
other(bar.i.buffer);
}
(Removed the include and printf, so it should compile as such anywhere without any SDK available.) If compiled with
@_ZN3fooIcE5inner6bufferE = external global [0 x i8], align 1
While for a
$_ZN3fooIcE5inner6bufferE = comdat any
@_ZN3fooIcE5inner6bufferE = linkonce_odr dso_local global [10 x i8] zeroinitializer, comdat, align 1
So something in Clang on the codegen level is producing different things for these two targets. |
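For comparison, here is a sketch of how the extern template contract is normally satisfied: the explicit instantiation declaration promises that some translation unit provides a matching explicit instantiation definition, which is what emits the static member. Both are collapsed into one file here only so the example links on its own; in the reproducer above they would live in separate translation units. The helper `touch_buffer` is hypothetical.

```cpp
#include <cassert>

template <typename T>
class foo {
public:
  class inner {
  public:
    static char buffer[];
  } i;
};

template <typename T>
char foo<T>::inner::buffer[10];

// Explicit instantiation declaration: "some other TU emits foo<char>".
extern template class foo<char>;

// Explicit instantiation definition: normally placed in exactly one other
// translation unit; kept here only so this sketch links by itself.
template class foo<char>;

// Hypothetical helper: writes through the static member and reads it back.
char touch_buffer(char value) {
  foo<char> bar;
  bar.i.buffer[0] = value;
  char stored = foo<char>::inner::buffer[0];
  assert(stored == value);
  return stored;
}
```

Removing the explicit instantiation definition turns this back into the reproducer: per the discussion above, the link is then expected to fail with GCC, while Clang on the mingw/cygwin targets emits the member itself.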
I tend to use printf and main for my reproducers, but I can see how including stdio could add unnecessary overhead. |
Yes, using that is clearer when one wants to link and execute it. But for comparing code generation, it’s simpler to omit them, so one can generate code for any of the near-infinite number of arch/OS targets that clang supports :-) Plus there are fewer things involved in the translation unit. |
I found the problem disappears with the patch:
--- origsrc/clang-20.1.3.src/lib/Sema/SemaTemplateInstantiate.cpp 2025-04-16 09:23:49.000000000 +0900
+++ src/clang-20.1.3.src/lib/Sema/SemaTemplateInstantiate.cpp 2025-05-04 22:24:26.010847400 +0900
@@ -4360,6 +4360,7 @@
== TSK_ExplicitSpecialization)
continue;
+#if 0
if (Context.getTargetInfo().getTriple().isOSWindows() &&
TSK == TSK_ExplicitInstantiationDeclaration) {
// On Windows, explicit instantiation decl of the outer class doesn't
@@ -4370,6 +4371,7 @@
// that users don't end up with undefined symbols during linking.
continue;
}
+#endif
if (CheckSpecializationInstantiationRedecl(PointOfInstantiation, TSK,
Record,
But I don't fully understand what the comment means.
if (Context.getTargetInfo().getTriple().isOSWindows() &&
TSK == TSK_ExplicitInstantiationDeclaration) {
// On Windows, explicit instantiation decl of the outer class doesn't
// affect the inner class. Typically extern template declarations are
// used in combination with dll import/export annotations, but those
// are not propagated from the outer class templates to inner classes.
// Therefore, do not instantiate inner classes on this platform, so
// that users don't end up with undefined symbols during linking.
continue;
}
|
Great! I think
|
My first thought was adding |
foo.h:
template<typename T>
class foo
{
public:
class inner
{
public:
static char buffer[];
} i;
};
bar.cc:
#include "foo.h"
template<typename T>
char foo<T>::inner::buffer[10];
__declspec(dllexport) foo<char> bar;
main.cc:
#include <stdio.h>
#include "foo.h"
template<typename T>
char foo<T>::inner::buffer[10];
extern foo<char> bar;
int main(void)
{
printf("%p\n", bar.i.buffer);
return 0;
}
With these files,
and
Both of the above linked successfully. I'm not sure what to make of this outcome. |
I'm working on a similar problem.
According to the test CodeGenCXX/windows-itanium-dllexport.cpp, this behavior is required not only by WindowsMSVC but also by Itanium, PS4 and PS5. So I think isWindowsMSVCEnvironment cannot be used here, and && !isOSCygMing is correct. main...kikairoya:llvm-project:cygwin_abi Notable behavior:
shows address in .exe module, not in .dll. |
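The shape of the condition under discussion can be sketched outside of Clang as a small predicate. This is a toy model, not Clang's code: `TargetModel`, `InstantiationKind` and `skipsNestedClass` are hypothetical names, and the two booleans merely stand in for the triple queries isOSWindows and isOSCygMing mentioned in the thread.

```cpp
// Toy model of the kinds of instantiation relevant to the check.
enum class InstantiationKind { Implicit, ExplicitDeclaration, ExplicitDefinition };

// Hypothetical stand-in for the target triple queries used in the thread.
struct TargetModel {
  bool isOSWindows;  // models getTriple().isOSWindows()
  bool isOSCygMing;  // models getTriple().isOSCygMing()
};

// Returns true when an "extern template" on the outer class would NOT be
// propagated to nested classes, under the proposed "&& !isOSCygMing" rule.
constexpr bool skipsNestedClass(TargetModel target, InstantiationKind kind) {
  return target.isOSWindows && !target.isOSCygMing &&
         kind == InstantiationKind::ExplicitDeclaration;
}

// Example targets (assumed flag values, for illustration only).
constexpr TargetModel kMsvcLike{true, false};
constexpr TargetModel kMingwOrCygwin{true, true};
constexpr TargetModel kLinux{false, false};
```

Under this model only the MSVC-like target keeps the special case; MinGW and Cygwin would follow the same path as Linux, which is the GCC-compatible behavior asked for in this issue.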
Cool. Some of those look like what @mati865 did already (https://github.com/llvm/llvm-project/pull/134458/files) but you split them up in a way that might actually be acceptable to merge. And personally, I wouldn't wait for all the tests to pass - I'm not anybody on this project, but I think incremental improvements are nice too! They also seem to appreciate smaller pull requests here. |
I started working on creating unit tests but more important things got my attention. I'd welcome if somebody takes over. |
woah, I just noticed
So in c++03 sentry already has template vis. It was removed from the normal header in #134885. But template vis isn't defined to anything on coff... 😦 |
I was trying a patch like this (thank you for making CI run on llvm-mingw)
# if defined(__MINGW32__) || defined(__CYGWIN__)
extern template _LIBCPP_EXTERN_TEMPLATE_TYPE_VIS basic_ostream<char>::sentry::sentry(basic_ostream<char>& __os);
extern template _LIBCPP_EXTERN_TEMPLATE_TYPE_VIS basic_ostream<char>::sentry::~sentry();
# endif
but I think scattering I'll test whether this approach is fine, but my local environment is broken: building libc++.dll says |
Found it. BFD ld (COFF) cannot export weak symbols. I need to use LLD. |
To keep things clear for myself, here is once more how compilers treat
Clang has |
@jeremyd2019's test https://github.com/jeremyd2019/llvm-mingw/actions/runs/15080594305 has completed, so we can conclude that the affected entities in libc++ are std::ostream::sentry and std::istream::sentry only. I have a patch that introduces a new keyword, 92a3b90. It can be merged independently, before the Clang patch. If this approach is acceptable, it's better to open a new PR for this new keyword, right? |
Yes.
Is there anything to be done for compatibility with mingw-gcc? I mean, it already doesn't work, but it'd be cool if what we do here could help that as well. |
Might this change need a new option like -f{a-short-nice-description-of-this-problem}-compatibility? If not, I have no other concerns about mingw-gcc compatibility, whether related to this issue or not. |
I'd have to defer to others on that, but I think no. By "compatibility with mingw-gcc" I mean libc++ with gcc. |
OK, I understand. |
OK, so here's what I'm currently thinking (and @mstorsjo feel free to correct me):
|
Thank you for your kind explanation, and I apologize for misreading. This time I think I understand what you are talking about. for 1. and 2.: main...kikairoya:llvm-project:libcxx-new-visibility-keyword-for-compat-mingw for 3.: If I may ask for your help, could you polish the descriptions I wrote? |
Actually, I don't think putting I believe the best ABI-breaking solution would be for all members to be
This doesn't seem right, not least because there's an operator missing between the |
I had forgotten what MSVC does. With this solution, it probably won't be possible to simply remove this new keyword, even in the future?
This seems to be a viable option. I think now, it might be better to apply
This reveals that I didn't test after formatting manually... (clang-format is disabled here)
I added it to minimize side effects, but as you say, it would have been better not to, trusting that GCC will support it in the future. I need to review once again what my patch does and what it should do. |
This will keep the ABI (I'm saying this without any testing), and according to https://releases.llvm.org/20.1.0/projects/libcxx/docs/DesignDocs/ABIVersioning.html , keeping the ABI strictly on Windows is not required (it says this about MSVC, but it seems to be the same for MinGW). |
Or
I don't have any idea whether or not GCC might add support for that, but it's generally better to test for features than versions (or compiler vendors). |
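Testing for a feature rather than for a compiler vendor could look roughly like the sketch below. `exclude_from_explicit_instantiation` is an existing Clang attribute, but using it as the feature under test is an assumption made for illustration, and the macro `DEMO_EXCLUDE`, the `counter` template and `bump_twice` are hypothetical; this is not libc++'s actual configuration.

```cpp
#include <cassert>

// __has_attribute is available in Clang and recent GCC; fall back to "no".
#ifndef __has_attribute
#define __has_attribute(x) 0
#endif

// Feature test instead of a vendor or version test: a compiler that gains
// the attribute later picks it up without any change to this header.
#if __has_attribute(exclude_from_explicit_instantiation)
#define DEMO_EXCLUDE __attribute__((__exclude_from_explicit_instantiation__))
#else
#define DEMO_EXCLUDE
#endif

template <typename T>
struct counter {
  T value;
  // Where the attribute exists, this member is left out of explicit
  // instantiations and is instantiated implicitly at the point of use.
  DEMO_EXCLUDE T next() { return ++value; }
};

extern template struct counter<int>;  // explicit instantiation declaration
template struct counter<int>;         // definition, so the sketch links alone

// Hypothetical helper exercising the member on either code path.
int bump_twice() {
  counter<int> c{0};
  c.next();
  int result = c.next();
  assert(c.value == result);
  return result;
}
```

The design point is that the same source builds whether or not the compiler knows the attribute, so no compiler name or version number has to be hardcoded.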
It's really going to need feedback from the libc++ reviewers to know for sure, but I'd guess
#if defined (__MINGW32__) || defined (__CYGWIN__)
... _LIBCPP_HIDE_FROM_ABI
#else
... _LIBCPP_HIDE_FROM_ABI_AFTER_V1
#endif
The big question is whether this is a new define in __config or something in basic_ostream.h and istream... Probably up to the libc++ reviewers what's less objectionable. I've found that the 'gcc' job in 'stage1' fails unless the function bodies are inside the class definition when |
For MSVC we can break ABI in libc++ if we need to. For mingw there’s no formal guarantee, but we would like to avoid breaking it a lot. A small low impact break may be fine, but one that breaks backwards compat for all existing binaries built against libc++.dll would be very painful for msys2, and for llvm-mingw I wouldn’t like to ship that either. |
Adding keyword |
Yes, it is. The new patch is here, but it is not yet ready for review. I tried to run the CI tests on a GitHub-provided worker, but the run aborted because the disk filled up. I may need to tweak the scripts to save storage, or prepare a self-hosted worker. |
I would say new member functions of inner classes (new or not) should be
shouldn't the |
You're right. It is required for new member functions regardless of whether the classes are new or not, and here it must be
Adding |
adding |
Oh, that is. |
It's painful to run 'gcc-14', which requires a massive amount of RAM...
Perhaps the instability of the CI tests is caused by the OOM killer destroying the docker process. |
I guess it could be a contributing factor, but the custom runners also run on some sort of cluster where the runner can be interrupted for some paying customer job, so it’s somewhat by design that they will be interrupted, to some extent. There’s some workflow that should try to restart them automatically as needed, but it’s of course not the most convenient thing. (And having privileges to restart the jobs manually does help.) |
I think we're about at a point that a PR would be helpful. I know there's some wordsmithing left to do, but it's much more convenient to review and make suggestions against a PR than a bare branch. |
I've posted the PR. Though CI is still as unstable as ever, may I ask you to check it before I set it as "ready for review"? Also, I'm trying to customize the CI build to verify that the ABI is kept for DLLs built by mingw-clang and clang-cl. It would be nice to be able to verify ABI stability in CI as on Linux, at least for the MinGW target. |
Indeed it would. This was actually discussed quite recently in #140507 (comment) - I haven't had time to look into it myself yet. But patches that add abilist support for mingw (and msvc, even if the ABI is unstable) would probably be very welcome (and very much appreciated by me); that makes it clearer if there are changes to the set of symbols exported. If working on that, it may be relevant to use |
Symbols in export table and import table of DLLs from To extract:
This includes export ordinals, but as far as I know they aren't used (and they can't be controlled), so they should be omitted when integrating such a test. |
This was first seen on Cygwin (x86_64-pc-windows-cygnus), where GCC is built with --with-default-libstdcxx-abi=gcc4-compatible, but I reproduced it on x86_64-pc-windows-gnu as well, so I decided to report here.
Start with a mingw-w64 gcc built without --enable-fully-dynamic-string configure argument, and possibly with --with-default-libstdcxx-abi=gcc4-compatible. It should be possible to switch to the gcc4-compatible ABI with -D_GLIBCXX_USE_CXX11_ABI=0 if not built with that option.
The following test program aborts when built with -O2, unless built with -std=c++20 or newer:
Output of clang++ -O2 -S -o test.clang.s test.cpp
test.clang.s.txt
output of g++ -O2 -S -o test.gcc.s test.cpp
test.gcc.s.txt
relevant difference:
vs
relevant bit of libstdc++ header:
_GLIBCXX_EXTERN_TEMPLATE is 1, and the issue manifests when _GLIBCXX_USE_CXX11_ABI is 0 and __cplusplus is <= 201703.
see msys2/MSYS2-packages#5329 for history of the investigation.
/cc @mstorsjo