Skip to content

Fix C++ name-mangling issue with caml_state on Cygwin#14220

Merged
dra27 merged 3 commits intoocaml:trunkfrom
gasche:cxx-api-compat-again
Sep 5, 2025
Merged

Fix C++ name-mangling issue with caml_state on Cygwin#14220
dra27 merged 3 commits intoocaml:trunkfrom
gasche:cxx-api-compat-again

Conversation

@gasche
Copy link
Copy Markdown
Member

@gasche gasche commented Sep 3, 2025

This PR is my attempt to cherry-pick the part of @MisterDA's #13777 that I suppose are required to fix #13541 again -- the compatibility of OCaml C headers with C++ compilers, which is broken on Cygwin on the current 5.4 branch.

(I kept the Changes entry in 5.4 as we intend to include this in 5.4, if merged.)

I have not tested that the patch suffices to fix the issue reported by @kit-ty-kate in #13541, so anyone able to test this without too much work is warmly welcome.

kit-ty-kate added a commit to kit-ty-kate/opam that referenced this pull request Sep 4, 2025
Copy link
Copy Markdown
Member

@kit-ty-kate kit-ty-kate left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

After encountering some resistance (#14224), i finally managed to build opam (which includes mccs – a C++ solver – by default) with this PR. Thanks a lot!

@Octachron
Copy link
Copy Markdown
Member

Between @kit-ty-kate test and the shortness of the patch that looks like a correct fix from a distance. @dra27 , do you have an opinion on the patch?

This causes name-mangling issues when linking. Use an intermediate
function to access the caml state.
@gasche gasche force-pushed the cxx-api-compat-again branch from db8da96 to d17093d Compare September 4, 2025 14:41
@dra27
Copy link
Copy Markdown
Member

dra27 commented Sep 4, 2025

Just checking something, but I expect it's only going to affect the title. What confused me when the original PR was opened is that it's not directly to do with TLS. The Cygwin port currently defines HAS_FULL_THREAD_VARIABLES and works - if that variable is defined in the mingw-w64 or MSVC ports then loading DLLs (e.g. #load "unix.cma") in the toplevel instantly stop working (because TLS actually doesn't work that way).

I'm trying something at the moment which will either lead to a (similarly simple) alternative or to my immediate approval of this PR 😊

@dra27 dra27 added the CI: Full matrix Makes the CI test a bigger set of configurations label Sep 4, 2025
@dra27
Copy link
Copy Markdown
Member

dra27 commented Sep 4, 2025

Right, investigations complete - this definitely is not the way to fix this. ocaml-mccs can be fixed directly with this:

diff --git a/src/context_flags.ml b/src/context_flags.ml
--- a/src/context_flags.ml
+++ b/src/context_flags.ml
@@ -28,7 +28,8 @@ let cxxflags =
     (ifc useCOIN ["-DUSECOIN"]) @
     (ifc useCLP  ["-DUSECLP"]) @
     (ifc useCBC  ["-DUSECBC"]) @
-    (ifc useSYM  ["-DUSESYM"])
+    (ifc useSYM  ["-DUSESYM"]) @
+    (ifc (Config.system = "cygwin") ["-fno-extern-tls-init"])
   in
   "(" ^ String.concat " " flags ^ ")"

and/or the compiler can be fixed with:

diff --git a/runtime/caml/misc.h b/runtime/caml/misc.h
--- a/runtime/caml/misc.h
+++ b/runtime/caml/misc.h
@@ -166,7 +166,7 @@ CAMLdeprecated_typedef(addr, char *);
 #endif

 #if defined(__STDC_VERSION__) && __STDC_VERSION__ >= 202311L || \
-    defined(__cplusplus)
+    defined(__cplusplus) && !defined(__CYGWIN__)
 #define CAMLthread_local thread_local
 #else
 #define CAMLthread_local _Thread_local

The ocaml-mccs fix/workaround changes the compilation of its C++ files (which don't use TLS themselves) to force GCC to omit the helper functions.

The underlying reason is that the C++11's thread_local and C11's _Thread_local (which is just __thread) do not have the same semantics w.r.t. initialisation (see the docs starting from -fextern-tls-init). Essentially, the caml_state symbol isn't being mangled, but there is an additional one created when C++11's thread_local is being used, even inside extern "C". On ELF systems, this all gets mangled away with aliases and so forth. On Cygwin, it's either - or possibly both - of COFF and flexlink that get in the way. The linker error actually comes from flexlink - it's possible that the Cygwin build of binutils actually knows what to do at this point, but flexlink doesn't know enough about how gcc is (possibly ab-)using COFF.

I don't think it matters - the only publicly visible TLS symbol in the runtime is caml_state and, given the design of the runtime, that's really unlikely to change (because any extra TLS-like stuff in the runtime almost certainly has to go in caml_state). Which is why I propose simplifying this and just using __thread in C++ mode on Cygwin.

@kit-ty-kate
Copy link
Copy Markdown
Member

Thanks for the investigation! I'm perfectly happy to wait for it to be fixed in the compiler

@dra27
Copy link
Copy Markdown
Member

dra27 commented Sep 4, 2025

To expand on the options:

  • The change in this PR switches Cygwin to use the caml_get_domain_state added in Simpler handling of caml_state on macOS and MinGW #11598. Remove dependency on libwinpthread-1.dll and GCC runtime library for mingw-w64 ports #11586 originally had a benchmarkably faster implementation for Windows (see 12bb998), but we swallowed the performance impact because that was considerably hairier as an approach. Cygwin is already slow, which makes me very reticent to adopt a concretely slower TLS mechanism only to support a scenario which affects C++ and because we're insisting on switching to a C++11 keyword long before either of the toolchains targeting Cygwin have actually removed ancient support for __thread.
  • The fact there's an easy workaround for ocaml-mccs and it's technically a 5.2 regression kinda removes the urgency for 5.4.0... that said, if we're happy with the change I propose in misc.h, it's a highly targeted alteration, so I'd be quite happy to open a PR for the misc.h change

@gasche
Copy link
Copy Markdown
Member Author

gasche commented Sep 4, 2025

I'm not convinced that caml_state is going to be our only TLS going forward. caml_state is per-domain information (and is only available when we own the domain lock), we may also want to keep per-thread information in the runtime and using TLS directly can be a good approach. (I see two other thread-local variables exposed in caml/*.h, namely caml_icount in instrtrace.h and caml_lockdepth in platform.h.)

@dra27
Copy link
Copy Markdown
Member

dra27 commented Sep 4, 2025

the only publicly visible TLS symbol

I should perhaps have emphasised that more - by publicly I mean not guarded by CAML_INTERNALS (which the others are) and therefore potentially accessed by user-programs in the same way as caml_state. caml_icount and caml_lockdepth would have the same mangling problem, it's just that it's "never" going to be seen because no C++ program is going to have cause to link to them (caml_icount controls the interpreter loop and caml_lockdepth is part of debugging in platform.c). Even for per-thread accounting, the issue would arise here where the TLS variable is both publicly available and accessed by its name (I'm hypothesising that anything we were to add would be accessed publicly via functions).

@gasche
Copy link
Copy Markdown
Member Author

gasche commented Sep 4, 2025

I amended the present PR with the second suggestion of David. For the record, the initial implementation of @MisterDA did the following:

diff --git a/configure.ac b/configure.ac
index 412eec0b616..9fe8670528c 100644
--- a/configure.ac
+++ b/configure.ac
@@ -1391,7 +1391,7 @@ AS_IF([! $cc_supports_atomic],
 # macOS and MinGW-w64 have problems with thread local storage accessed from DLLs
 
 AS_CASE([$target],
-  [*-apple-darwin*|*-w64-mingw32*|*-pc-windows], [],
+  [*-apple-darwin*|*-w64-mingw32*|*-pc-windows|*-pc-cygwin], [],
   [AC_DEFINE([HAS_FULL_THREAD_VARIABLES], [1])]
 )

@kit-ty-kate
Copy link
Copy Markdown
Member

I re-tested this branch with its new content just to be sure and it still fixes the issue. Thanks!

configure.ac Outdated

AS_CASE([$target],
[*-apple-darwin*|*-w64-mingw32*|*-pc-windows], [],
[*-apple-darwin*|*-w64-mingw32*|*-pc-windows|*-pc-cygwin], [],
Copy link
Copy Markdown
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

The changes to configure are meant to have gone, right?

@dra27
Copy link
Copy Markdown
Member

dra27 commented Sep 5, 2025

The changes to configure are still in the branch, which will take precedence, I'm afraid

@gasche
Copy link
Copy Markdown
Member Author

gasche commented Sep 5, 2025

Oops, my mistake, sorry for the time wasted on your side. I'll get rid of the configure changes and rebase.

@gasche gasche force-pushed the cxx-api-compat-again branch from f4d18c7 to 69584d1 Compare September 5, 2025 08:30
@gasche gasche force-pushed the cxx-api-compat-again branch from 69584d1 to bedd677 Compare September 5, 2025 08:37
@gasche
Copy link
Copy Markdown
Member Author

gasche commented Sep 5, 2025

(I got it wrong again by reverting the configure.ac change but not the configure change! But this should be okay now.)

@dra27
Copy link
Copy Markdown
Member

dra27 commented Sep 5, 2025

Oops, my mistake, sorry for the time wasted on your side. I'll get rid of the configure changes and rebase.

No problem at all! All looks to good to me, although I'm not sure where it all sits with approvals now 😊

Copy link
Copy Markdown
Member

@kit-ty-kate kit-ty-kate left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

The build is still successful

@gasche
Copy link
Copy Markdown
Member Author

gasche commented Sep 5, 2025

(This needs a green-approval from Florian or David, and then we can just merge and cherry-pick in 5.4.)

@Octachron
Copy link
Copy Markdown
Member

Naive question, from reading various discussions, I have the impression that one source of the issue is that C23 thread_local (aka _THREAD_LOCAL) is closer to constinit thread_local in C++20 rather than just thread_local. Am I missing something?

so prefer _Thread_local on Cygwin systems. */
#define CAMLthread_local thread_local
#else
#define CAMLthread_local _Thread_local
Copy link
Copy Markdown
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

However, using _Thread_local for C++ compiler on cygwin seems like a very reasonable way to avoid semantics difference between C and C++ thread_local to me.

@dra27
Copy link
Copy Markdown
Member

dra27 commented Sep 5, 2025

Naive question, from reading various discussions, I have the impression that one source of the issue is that C23 thread_local (aka _THREAD_LOCAL) is closer to constinit thread_local in C++20 rather than just thread_local. Am I missing something?

Doesn't sound naïve to me! It's GCC and Clang 10+, apparently (@MisterDA?)

@MisterDA
Copy link
Copy Markdown
Contributor

MisterDA commented Sep 5, 2025

Naive question, from reading various discussions, I have the impression that one source of the issue is that C23 thread_local (aka _THREAD_LOCAL) is closer to constinit thread_local in C++20 rather than just thread_local. Am I missing something?

Doesn't sound naïve to me! It's GCC and Clang 10+, apparently (@MisterDA?)

I'm not aware of a change of semantics of _Thread_local between C11 and C23. In C11 thread_local is a macro defined in threads.h that resolves to _Thread_local; since C23, thread_local is itself a keyword, which may also be a predefined macro, so <threads.h> no longer provides it. A footnote reads:

These alternative keywords are obsolescent features and should not be used for new code and development.

which is why I favored the new spelling.

As we noticed C and C++ differ in the semantics of this keyword, see for instance llvm/llvm-project#70107.

When implementing thread_local as a keyword in C23, we accidentally started using C++11 thread_local semantics when using that keyword instead of using C11 _Thread_local semantics.

finding an even better C++ alternative for thread_local seems worth doing

Perhaps it's just wrong to use the C++ thread_local and we should use the C _Thread_local in C++ mode. I don't know enough about this. The declaration is already protected by extern "C"...

@MisterDA
Copy link
Copy Markdown
Contributor

MisterDA commented Sep 6, 2025

Naive question, from reading various discussions, I have the impression that one source of the issue is that C23 thread_local (aka _THREAD_LOCAL) is closer to constinit thread_local in C++20 rather than just thread_local. Am I missing something?

You're on the right track ;)

  • N2850: _Thread_local for better C++ interoperability with C (proposal for C2X and C++);

    It is observed that WG 14 document N2654, “Revise spelling of keywords v5”, proposes to replace _Thread_local with thread_local as the preferred form in the C standard. Given the information presented in this paper, the C committee may want to hold back on taking a direction which advocates increased usage of thread_local as opposed to _Thread_local.
    This paper proposes to maintain the status quo of C17 where thread_local is a macro, keeping _Thread_local, with the known differences from C++, the preferred form.

    Beyond a “mere” overhead cost, it is also worth noting that, in some environments, variables defined _Thread_local in C do not provide all of the auxiliary symbols necessary for its usage as a C++ thread_local variable; therefore, references to the C-defined variable from C++ potentially result in link errors.

    For C++, it is noted that unnecessary cost occurs in practice even with constinit thread_local where incomplete class types are involved. There may be a desire for a stronger constinit or to leverage P1247’s no_destroy attribute (or a non-attribute version of the same) to solve this issue in a more general manner; however, _Thread_local already exists in C, is implemented as an extension in C++ by Clang, and can be used to convey the intended code generation semantics. As an additional data point: at the time of this writing, neither of Clang’s [[clang::no_destroy]] nor -fno-c++-static-destructors work for this purpose. GCC and Clang both offer __thread in both C and C++ mode (with the intended code generation semantics).

  • N2934: Revise spelling of keywords (accepted into C23)

    Paper N2850 raised an issue that C’s and C++’s thread local variables are not compatible.
    This is because in C++ the property if a thread local variable needs a constructor (or not) at thread initialization time is not necessarily known in all TU that see a declaration of that variable. C does not have constructors and so it only allows initialization of thread local variables with constant expressions.
    It was thus suggested that implementations that combine both languages may distinguish between thread_local for the C++ variant and _Thread_local for the interoperable C variant. Discussion in the C/C++ compatibility group revealed that even as of today this approach is doomed to fail, because C already has the compatibility macro thread_local in <threads.h>. Thus even as today, implementations combining both languages have to address this problem differently, for example by enhancing their linkers with the knowledge if a thread local variable has a constant expression as initializer. The preferred solution that has recently be discussed with the C – C++ liaison group is to make that choice for the C or C++ ABI by using the established feature of specifying C linkage with extern "C".

For internal C code it doesn't matter if we use _Thread_local or the thread_local macro/keyword. It seems that the future direction is to prefer _Thread_local rather than C23's suggestion of thread_local.

Now what's happening when processing the header with a C++ compiler?

#ifdef __cplusplus
extern "C" _Thread_local int x;
#endif

int f() {
    return x;
}
$ /opt/homebrew/opt/gcc@15/bin/g++-15 -Wall -c test.cpp
test.cpp:2:12: error: '_Thread_local' does not name a type; did you mean 'thread_local'?
    2 | extern "C" _Thread_local int x;
      |            ^~~~~~~~~~~~~
      |            thread_local
test.cpp: In function 'int f()':
test.cpp:6:12: error: 'x' was not declared in this scope
    6 |     return x;
      |            ^
$ /opt/homebrew/opt/llvm@21/bin/clang++ -Wall -c test.cpp # success

To summarise, and I hope I got everything right, we could:

  • stick to _Thread_local in C code; and
  • use _Thread_local in C++ code if the compiler is Clang; and
  • use constinit thread_local in C++20 code; and
  • use thread_local in C++ code if the compiler is G++, with its -fno-extern-tls-init flag, hoping that it doesn't break the end executable if there are other TLS objects (I'm not sure of the implications here);
  • investigate or reject other C++ compilers (namely, MSVC);
  • or hope that we'll be able to use extern "C" thread_local in C2X and C++26 ;)

I have to admit this is a bit hand-wavy.

@gasche
Copy link
Copy Markdown
Member Author

gasche commented Sep 6, 2025

What would go wrong if we used _Thread_local everywhere?

@gasche
Copy link
Copy Markdown
Member Author

gasche commented Sep 6, 2025

(Ah, silly question, the answer is in the code: g++ does not support it.)

@gasche
Copy link
Copy Markdown
Member Author

gasche commented Sep 6, 2025

It looks like all of gcc, g++, clang and clang++ support __thread just fine. Maybe we could use that everywhere?

@MisterDA
Copy link
Copy Markdown
Contributor

MisterDA commented Sep 7, 2025

It looks like all of gcc, g++, clang and clang++ support __thread just fine. Maybe we could use that everywhere?

Not for MSVC, but I don't think we should abandon _Thread_local completely. Maybe this?

#if defined(__STDC_VERSION__) || defined(__clang__)
#define CAMLthread_local _Thread_local
#elif defined(_MSC_VER)
#define __declsplec(thread)
#else
#define CAMLthread_local __thread
#endif

@gasche
Copy link
Copy Markdown
Member Author

gasche commented Sep 7, 2025

The current misc.h definition does not use declsplec(thread), why does it work for MSVC? Is it because the recent MSVC version that 5.x require all support _Thread_local as well? What happens (today) when one tries to build OCaml C stubs from the MSVC equivalent of a C++ compiler? (Or how do opam+MSVC systems build ocaml-mccs?)

@MisterDA
Copy link
Copy Markdown
Contributor

MisterDA commented Sep 8, 2025

The current misc.h definition does not use declsplec(thread), why does it work for MSVC? Is it because the recent MSVC version that 5.x require all support _Thread_local as well?

Yes, see C11 Threads in Visual Studio 2022 version 17.8 Preview 2 and __declspec(thread).

What happens (today) when one tries to build OCaml C stubs from the MSVC equivalent of a C++ compiler? (Or how do opam+MSVC systems build ocaml-mccs?)

The blogpost reads:

Unlike C11 atomics, C11 threads do not share an ABI with C++’s <thread> facilities, but C++ programs can include the C11 threads header and call the functions just like any C program. Both are implemented in terms of the primitives provided by Windows, so their usage can be mixed in the same program and on the same thread.

and I'm not sure what to make of that. Can we use either thread_local or __declspec(thread) without fear?

@MisterDA
Copy link
Copy Markdown
Contributor

If the following example correctly models the situation, then we can observe that the same code is generated between C and C⁠+⁠+ compilers to read and access a thread local variable. Here's an example for x86_64 with latest MSVC, Clang, and GCC: https://godbolt.org/z/bqnEhsqz6.

#if defined(__STDC_VERSION__) || defined(__clang__)
#define CAMLthread_local _Thread_local
#elif defined(_MSC_VER)
#define CAMLthread_local __declspec(thread)
#else
#define CAMLthread_local __thread
#endif

#ifdef __cplusplus
extern "C" {
#endif

#if defined(_MSC_VER) && defined(__cplusplus)
[[msvc::no_tls_guard]] 
#endif
extern CAMLthread_local int x;

#ifdef __cplusplus
}
#endif

void set_x(int i) {
    x = i;
}

int get_x() {
    return x;
}

MisterDA added a commit to MisterDA/ocaml that referenced this pull request Nov 17, 2025
MisterDA added a commit to MisterDA/ocaml that referenced this pull request Nov 17, 2025
MisterDA added a commit to MisterDA/ocaml that referenced this pull request Nov 17, 2025
MisterDA added a commit to MisterDA/ocaml that referenced this pull request Nov 17, 2025
MisterDA added a commit to MisterDA/ocaml that referenced this pull request Nov 17, 2025
MisterDA added a commit to MisterDA/ocaml that referenced this pull request Nov 17, 2025
MisterDA added a commit to MisterDA/ocaml that referenced this pull request Nov 17, 2025
MisterDA added a commit to MisterDA/ocaml that referenced this pull request Nov 17, 2025
MisterDA added a commit to MisterDA/ocaml that referenced this pull request Nov 17, 2025
MisterDA added a commit to MisterDA/ocaml that referenced this pull request Dec 15, 2025
MisterDA added a commit to MisterDA/ocaml that referenced this pull request Dec 15, 2025
MisterDA added a commit to MisterDA/ocaml that referenced this pull request Dec 16, 2025
MisterDA added a commit to MisterDA/ocaml that referenced this pull request Dec 16, 2025
MisterDA added a commit to MisterDA/ocaml that referenced this pull request Dec 18, 2025
MisterDA added a commit to MisterDA/ocaml that referenced this pull request Jan 5, 2026
MisterDA added a commit to MisterDA/ocaml that referenced this pull request Jan 5, 2026
MisterDA added a commit to MisterDA/ocaml that referenced this pull request Jan 5, 2026
MisterDA added a commit to MisterDA/ocaml that referenced this pull request Jan 5, 2026
MisterDA added a commit to MisterDA/ocaml that referenced this pull request Jan 15, 2026
MisterDA added a commit to MisterDA/ocaml that referenced this pull request Jan 15, 2026
Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment

Labels

CI: Full matrix Makes the CI test a bigger set of configurations

Projects

None yet

Development

Successfully merging this pull request may close these issues.

[5.2 & 5.3] Compiling C++ code using the OCaml C API results in name-mangled caml_state on Cygwin

5 participants