Skip to content

Unify the two executable header implementations#13988

Merged
nojb merged 20 commits intoocaml:trunkfrom
dra27:unified-header
Jun 28, 2025
Merged

Unify the two executable header implementations#13988
nojb merged 20 commits intoocaml:trunkfrom
dra27:unified-header

Conversation

@dra27
Copy link
Copy Markdown
Member

@dra27 dra27 commented Apr 25, 2025

On Unix systems, bytecode executables by default are shebang "scripts" where the #! points to ocamlrun (or a trivial sh script which exec's ocamlrun). However, even on Unix not all systems have supported shebang scripts, and there has always been an alternative executable stub, written in C, living in stdlib/header.c. On Windows, where the shebang mechanism is never available, the executable is always required. However, the code is different and it lives a separate life in stdlib/headernt.c. Today, all the Unix systems which can run OCaml also support shebang scripts but, mainly for historical reasons, the Unix version of the executable header remains because the Cygwin port of OCaml (which is a Unix-like clone running on top of Windows) uses it.

Relocatable OCaml changes the mechanism by which these normal bytecode executables find ocamlrun. This primary goal of this PR is to unify the common parts of header.c and headernt.c because the common code between the two is much larger in Relocatable OCaml. Along the way, there are a few minor differences of behaviour / bugs which can be addressed.

The final commit then merges headernt.c into header.c. This last part is not strictly necessary, but I propose is a good idea. It creates the following structure:

/* header.c */

/* "Header" - platform-specific code to support "Common" */
#ifdef _WIN32
/* Windows-specific stuff */
#else
/* Unix-specific stuff */
#endif

/* "Common" - functions shared with both implementations */

/* "Main" - the header entry point */*
#ifdef _WIN32

_Noreturn void __cdecl wmainCRTStartup(void)
{ /* ... */ }

#else

int main(int argc, char *argv[])
{ /* ... */ }

#endif

In this PR, the "Header" is quite large for both implementations, "Common" is fairly small and "Main" is a similar size for each. With Relocatable OCaml, and with some additional fixes I propose afterwards, "Common" becomes much larger and the Unix part "Header" becomes very small. I think it will be slightly easier to maintain just stdlib/header.c rather than adding a third stdlib/header_common.c.

In terms of review, I suggest that it's worth reviewing the final form of stdlib/header.c as it ends up (i.e. as if it were a new file, rather than a diff) and then viewing the commits as the pathway from the current state to that. The commits along the way:

  • Deal with a minor typo from #12751 affecting recompilations of the header
  • Deal with a regression of 1bbd7c9 in #1200 - the switch to use wmain as the entrypoint caused a considerable increase in size of the executable header, which was the purpose of the original patch. On my laptop, the Linux executable header (in Ubuntu 24.04) is ~14KiB and the Cygwin header is about 10KiB (just for comparison). With the invocations corrected, the x64 mingw-w64 header reduces by 12KiB to 5KiB (x86 by 9.5KiB to 4.5KiB) and the MSVC x64 header by 8.5KiB to 4KiB (x86 by 7KiB to 3.5KiB)
  • I have observed problems with stripping the MSVC header (which was enabled in #13282) - in this instance it was reducing it in size slightly and corrupting it in the process
  • By including <stdint.h> in caml/exec.h, caml/exec.h becomes "standalone" (which is useful in further work not to leak lots of other things from caml/mlvalues.h)
  • Technically, the Unix version of the executable header was leaking a file descriptor across the exec call
  • Although the RNTM section is not normally given a relative PATH, by switching from execv to execvp, we have the same semantics for the call between header.c and headernt.c

For information, the work I wish to stack on top of this:

  • Relocatable OCaml updates code shared between the two C files, which is the motivation for wanting to unify the common portions of both headers first
  • The Unix version of the header (which is actually used by the Cygwin port) will benefit from having access to caml_executable_name - I propose sharing all the PATH-searching code with runtime/unix.c, especially as it's duplicated inconsistently at the moment
  • I wish to address this long-standing piece of weirdness (what's going on is left as an exercise to the reader...):
C:\Users\DRA>ocamlopt.byte -vnum
5.3.0

C:\Users\DRA>ocamlopt.byte.exe
No input files

C:\Users\DRA>ocamlopt.byte
no bytecode file specified

and the fix for that similarly involves changes in both header.c and headernt.c otherwise.

@MisterDA
Copy link
Copy Markdown
Contributor

_Noreturn is deprecated in C23, which GCC 15 and the upcoming clang will default to. I haven't tested if they warn about it, or what we're to expect in term of compatibility, but you could consider using a wrapper macro à la CAMLnoret. (Don't use __has_c_attribute or __has_attribute to check for support of the C23 attribute syntax as they also report true in C11/C17 mode).

#if defined(__STDC_VERSION__) && __STDC_VERSION__ >= 202311L
[[noreturn]]
#else
_Noreturn
#endif

@dra27
Copy link
Copy Markdown
Member Author

dra27 commented Apr 28, 2025

Hah - I'd forgotten that C11 giveth and C23 taketh away! I've updated that commit to define a NORETURN instead (the motivation to keep the caml headers out of the equation is because the header is being linked with out a C standard library on mingw-w64, and it turns out it's very easy for them to creep back in when we start including the runtime headers)

@dra27
Copy link
Copy Markdown
Member Author

dra27 commented Apr 30, 2025

@MisterDA - are you able/willing to do a full review?

Copy link
Copy Markdown
Contributor

@MisterDA MisterDA left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

LGTM, the review commit-by-commit helps.

A few minor comments on top of the inline ones:

  • I'd prefer seeing perror and strerror_r over ad-hoc error messages in the unix path;
  • read_size is fishy, I'm not sure if the size is stored in big or little endian (I'm guessing LE) but I worry there could be a possible mismatch if the endianness of the host and what's stored in the binary differ. If they can never differ, perhaps a memcpy is just as clear (that is, if memcpy is even allowed).

ENTRYPOINT = wmainCRTStartup
endif
tmpheader.exe: OC_LDFLAGS += \
-nostdlib -nostartfiles -Wl,--gc-sections -Wl,-e,$(ENTRYPOINT)
Copy link
Copy Markdown
Contributor

@MisterDA MisterDA Apr 30, 2025

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

For reference: -nostdlib and -nostartfiles. I understand from the documentation that -nostdlib implies -nostartfiles. There's also this note:

In most cases, you need libgcc.a even when you want to avoid other standard libraries. In other words, when you specify -nostdlib or -nodefaultlibs you should usually specify -lgcc as well. This ensures that you have no unresolved references to internal GCC library subroutines.

If the program runs without -lgcc I suppose there's no harm. It seems to be needed for C++ and initializing data.

For reference on --gc-sections:

Once you’ve created the objects and static libraries with these switches, the linker can perform the dead code elimination. You can do this by specifying the -Wl,--gc-sections switch to your gcc command [...]. This causes the linker to perform a garbage collection and remove code and data that are never referenced.

Copy link
Copy Markdown
Member Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Hmm, I'm pretty sure that I just added those while both without noticing that -nostdlib implies -nostartfiles - it seems desirable to be explicit, even if I probably did that by mistake?

Copy link
Copy Markdown
Member Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

However, I'm very intentionally not passing -lgcc - the code (including in Relocatable) is moderately carefully crafted not to require -lgcc

Copy link
Copy Markdown
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Thanks for the explanation. clang lists -nostartfiles and -nostdlib but doesn't document them. I agree with you that it's preferable to be explicit and keep the two flags.

Copy link
Copy Markdown
Member Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

I've added a little more comment above the recipe (and the links will remain in the PR, obviously, thanks!)

@dra27
Copy link
Copy Markdown
Member Author

dra27 commented May 2, 2025

Thanks, @MisterDA!

  • I'd prefer seeing perror and strerror_r over ad-hoc error messages in the unix path;

This changes the behaviour, which is beyond the scope of what's in this PR. The original function is:

static void errwrite(const char * msg)
{
  fputs(msg, stderr);
}
  • read_size is fishy, I'm not sure if the size is stored in big or little endian (I'm guessing LE) but I worry there could be a possible mismatch if the endianness of the host and what's stored in the binary differ. If they can never differ, perhaps a memcpy is just as clear (that is, if memcpy is even allowed).

Again, can we stick to the PR? I don't believe I've materially changed this function? read_size is decoding a big endian (first byte shifted left 24...) which is the result of Stdlib.output_binary_int (which is defined to be big-endian).

@dra27 dra27 force-pushed the unified-header branch from d14166b to 13a4cd1 Compare May 2, 2025 09:29
@MisterDA
Copy link
Copy Markdown
Contributor

MisterDA commented May 2, 2025

Thanks for the explanations. The PR looks good to me, all the changes make sense to me on their own and taken together.

@dra27 dra27 force-pushed the unified-header branch from 13a4cd1 to f17db87 Compare May 2, 2025 13:21
@dra27 dra27 force-pushed the unified-header branch 2 times, most recently from 7d5b8de to 5e40770 Compare June 25, 2025 11:02
dra27 and others added 16 commits June 27, 2025 14:40
Affects forced recompilation of runtime-launch-info
stdlib/headernt.c was adapted in OCaml 3.00 to reduce its size by
avoiding the use of the CRT and using Windows API functions directly
(this is a well-studied trick on Windows, principally as a puzzle for
producing tiny binaries).

This got "regressed" slightly in OCaml 4.06, in the complex introduction
of wide character support for Windows, as the mingw-w64 incantation
required was unclear, so the entry point was changed to wmain, and the
size of the header increased.

By switching from wcslen (a CRT function) to lstrlen (a Win32 API
function), headernt.c again only requires kernel32.dll.

Additional flags are added for both ld (mingw-w64) and link (MSVC) to
squeeze every last byte out of tmpheader.exe. The MSVC version of the
header is once again no longer passed through strip, as this was found
to be corrupting the executable (and had never been reducing its size
anyway).
Modern C compilers are sufficiently intelligent not to need the inlining
hints!

Inline the definition from caml/misc.h for CAMLnoret.

Co-authored-by: Antonin Décimo <antonin@tarides.com>
Already updated to remove the actual test in s.h, since XPG1 (1985)
required it and it is therefore part of the Single Unix
Specification (1992), but the _WIN32 guard and the loading of s.h are
unnecessary.
Defined in sys/stat.h and unistd.h respectively by the Single Unix
Specification.
O_BINARY is added for maximum Cygwin compatibility, but it's not a Posix
flag.
Have exec.h include <stdint.h> itself. The bytecode executable header
now only depends on exec.h.
Modernisations applied in headernt.c but not applied to header.c - use
uint32_t rather than unsigned long
Brings the behaviour of the two headers into line with each other - if
RNTM is ocamlrun on Unix, ocamlrun will now be sought in PATH.
Copy link
Copy Markdown
Member

@damiendoligez damiendoligez left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

I had a quick look, and I trust @MisterDA's review.

@damiendoligez
Copy link
Copy Markdown
Member

@dra27 I'm assuming you don't want the commits squashed?

@nojb nojb merged commit 076b435 into ocaml:trunk Jun 28, 2025
31 of 34 checks passed
Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment

Projects

None yet

Development

Successfully merging this pull request may close these issues.

4 participants