Skip to content

C++ API compatibility, again#13777

Closed
MisterDA wants to merge 5 commits intoocaml:trunkfrom
MisterDA:cxx-api-compat
Closed

C++ API compatibility, again#13777
MisterDA wants to merge 5 commits intoocaml:trunkfrom
MisterDA:cxx-api-compat

Conversation

@MisterDA
Copy link
Copy Markdown
Contributor

@MisterDA MisterDA commented Feb 4, 2025

@kit-ty-kate reported in #13541:

Trying to compile opam (which uses C⁠+⁠+ through mccs) on Cygwin with the 5.3 branch, fails with:

** Cannot resolve symbols for descriptor object:
 _ZTH10caml_state
 File "caml_startup", line 1:
  Error: Error during linking (exit code 2)

It looks like C⁠+⁠+ name mangling is used on a symbol which should have a C calling convention. This usually signals a missing extern "C" annotation, which is commonly added in C headers as:

#ifdef __cpluplus
extern "C" {
#endif   
/* symbols */
#ifdef __cplusplus
}
#endif

In #13591 we went through all of the public headers and added these guards for all symbols that are not hidden behind #ifdef CAML_INTERNALS.

We also introduced a test, cxx-api, which should have caught incompatibilities with C++ being introduced in the headers, again.

I've just found out that 94fafd4 introduced such a mistake (using a _Atomic uintnat publicly instead of atomic_uintnat from camlatomic.h), that the test should have prevented… Turns out the test was not working. It was never executing the bytecode/native steps and was always succeeding if MSVC wasn't used, and skipped otherwise.

The first step is to fix the test. We want a file built as C++ that includes all the public C headers. The OCaml compiler drivers will call the C compiler driver if given a .c file. We'd like to instruct the C compiler to treat it as C⁠+⁠+, this is done using the -xc++ option.

Some compilers (e.g., Apple clang 16 and earlier) default to an old C⁠+⁠+ standard when given the -xc++ flag, even if they default to C11 or newer. We need -std=c++11 (for atomic support, among other things).

We would use flags = "-ccopt -xc++ -ccopt -stdc++11", but -ccopt <opt> passes the option <opt> to the C compiler and linker. On Windows, the FlexDLL linker is called directly by the OCaml compiler driver, and rightfully rejects -xc++ -std=c++11. On other systems, the linker is not called directly, but through the (C) compiler driver. The compiler driver might then interpret anything that follows these flags, including object files, as C⁠+⁠+.

The solution to this headache is to build the C⁠+⁠+ file directly, without stepping through the OCaml compiler. This is also what Dune does, which doesn't suffer from this problem.

Fixing the test uncovered 3 incompatibilities with C⁠+⁠+:

  1. I introduced in Annotate alloc/free open/close functions with compiler attributes #13537 annotations to various alloc/free functions that help keeping track of allocation provenance and lifetime for static analysis. They take the form of variadic macros for extensibility, to be able to add new parameters without breaking potential users of these macros, external to the runtime. Turns out variadic macros require at least an argument, and this pattern isn't standard (until C⁠+⁠+20). It may not be standard C either. The fix is to remove the variadic part. Perhaps we should state that the macros may be subject to change.

  2. The use of a flexible array member in the bigarray structure. Although valid C99, this was never part of any C++ standard (but is a common extension). There is apparently no UB-free way of expressing a FAM in C⁠+⁠+, even with 0‑sized or 1‑sized arrays ending the structure.

  3. changes-meaning errors: in struct addrmap { value key; value value};, the field value hides the value typedef.

The C⁠+⁠+ API compatibility test is now best-effort. Assuming a modern GCC compiler, it'll try to build a simple stub as C⁠+⁠+, including all the public headers. It will ignore errors FAM and changes-meaning errors, and report any new errors. Using GCC JSON reporting makes it easier to filter errors. The test doesn't cover Windows targets or clang errors.

The test can be seen succeeding at failing at MisterDA#119. As the test was incorrect, this suggests that tests exhibiting bug fixes could be added first in history, marked with expected failure, and the commit fixing the bug would remove the expected failure as the test should succeed.

Now back to the root Cygwin cause: the failure looked like a problem with name mangling. The demangled symbol is TLS init function for caml_state. I've fought with the compiler to try and understand the error, but it seems that Cygwin, just as MinGW-w64 and macOS targets, needs an indirection to be able to access thread-local storage variables from DLLs. This fixes the issue.

If accepted, the Cygwin commit and the test should be backported to the 5.3 branch.

@MisterDA
Copy link
Copy Markdown
Contributor Author

MisterDA commented Feb 5, 2025

I think this PR is too big for its own good, I'll split it.

@MisterDA MisterDA marked this pull request as draft February 5, 2025 14:09
@xavierleroy
Copy link
Copy Markdown
Contributor

changes-meaning errors: in struct addrmap { value key; value value};, the field value hides the value typedef.

Funny. I tend to agree with the C++ compiler that this C code is needlessly confusing. Naming the field "val" would be more symmetrical with "key", in any case.

_Atomic is standard C11 but is not standard C++ (although sometimes
available as a compiler extension). Use atomic_uintnat from our
camlatomic.h.

    In file included from all-includes.h:6:
    /Users/antonin/Tarides/ocaml/trunk/runtime/caml/custom.h:54:12: error: '_Atomic' does not name a type; did you mean 'Atom'?
       54 | CAMLextern _Atomic uintnat caml_custom_major_ratio;
          |            ^~~~~~~
          |            Atom
Varadic macros expect at least one parameter for the variadic part (at
least until C++20). They were used to allow for possible future
extension by adding more parameters.

    /Users/antonin/Tarides/ocaml/trunk/runtime/caml/memory.h:89:32: error: ISO C++11 requires at least one argument for the "..." in a variadic macro [-Werror]
       89 | CAMLmalloc(caml_stat_free, 1, 1) CAMLreturns_nonnull()
          |                                ^
This causes name-mangling issues when linking. Use an intermediate
function to access the caml state.
This avoids the field shadowing the `typedef x value` type in
C++ (changes-meaning error).
@MisterDA
Copy link
Copy Markdown
Contributor Author

MisterDA commented Sep 5, 2025

I'll try revisiting the C++ test, in the meantime let's close this as the issue has been resolved.

@MisterDA MisterDA closed this Sep 5, 2025
@gasche
Copy link
Copy Markdown
Member

gasche commented Sep 5, 2025

I do agree with you that having a CI test that our headers are supported by C++ compilers would also be useful to avoid such regressions in the future.

Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment

Labels

None yet

Projects

None yet

Development

Successfully merging this pull request may close these issues.

4 participants