
Relocatable OCaml - Searching and Suffixing #14245

Merged
nojb merged 23 commits into ocaml:trunk from dra27:runtime-searching
Dec 12, 2025

Member

@dra27 dra27 commented Sep 15, 2025

This is the third of three PRs which implement Relocatable OCaml as proposed in ocaml/RFCs#53. Bytecode executables (including those produced when building the compiler distribution itself) usually contain an absolute path for the location of ocamlrun, which is incompatible with Relocatable OCaml. The patches here provide an alternate mechanism for these executables to find the interpreter without needing its absolute location. This change is combined with a name mangling scheme which is used both for the bytecode interpreter executables' filenames and holistically to fix long-standing issues with the naming of shared libraries (both the shared runtime libraries and shared bytecode C stub libraries). Together, the patches address section 2 of the RFC.

There are several mechanisms for linking bytecode executables. This PR is exclusively concerned with standalone bytecode executables, which are those where the compiled bytecode image is prefixed with a launcher, but not with the OCaml runtime interpreter itself. This launcher can be a simple "shebang" line (e.g. #!/path/to/ocamlrun) or a small executable, compiled from stdlib/header.c. In this case, "standalone" refers to the image being separate from the runtime, rather than to the executable itself being standalone. There are situations where the interpreter itself cannot be used in a shebang line, and in this case, the compiler today instead emits a tiny shell script using #!/bin/sh as the interpreter.

Windows does not support shebang executables, always using the executable stub. In order to assist the old binary distributions of Windows OCaml, the Windows version of the executable stub has always performed a PATH-search for ocamlrun. However, this was done at a time when it was expected that a user would have a single installation of a single version of OCaml on their system, which is no longer true. It is very much the case today that an ocamlrun in PATH has little to no guarantee of being the ocamlrun required by a given bytecode executable on a user's machine. Therefore, a name mangling scheme is also proposed - i.e. we increase the ways in which a bytecode executable may seek to find its runtime, but by refining the name of the file it's searching for, we increase the chances of finding the correct binary and of multiple installations of OCaml not interfering with each other. Fundamentally, this simply means that two different versions of OCaml (either a different release, or a relevantly different configuration) have different names for the bytecode interpreter. There are two crucial consequences to this: the error messages when things do go wrong are much better (being along the lines of "I can't find an interpreter for OCaml 5.5" rather than "bad magic number", "symbol not found" or just a segfault) and it also stops things from "silently working" and then suddenly failing one day because a release of OCaml happened to add a new function to the Unix library. This name mangling scheme is likewise applied to the shared libraries loaded by the interpreter.

The key changes are:

  • A new command line option for ocamlc, -launch-method, allows dynamic selection of either the shebang (#!/usr/bin/ocamlrun) or executable-stub launcher for standalone bytecode executables. This option allows the metadata in the runtime-launch-info file to be removed.
  • A new command line option for ocamlc, -runtime-search, allows a new mechanism to be specified for the header of standalone bytecode executables where instead of only executing ocamlrun from a fixed location, they are instead able to search for it.
  • A name mangling scheme is introduced to be used for shared libraries (both the shared library versions of the OCaml runtime and also for bytecode C stub shared libraries) and the bytecode interpreter executables (ocamlrun, etc.).
  • A new pair of command line options for ocamlmklib, -suffixed and -no-suffixed, and a new command line option for ocamlc, -dllib-suffixed, provide a mechanism for using this name mangling scheme for bytecode C stub shared libraries. This mechanism is transparent to the user, for example #load "unix.cma" continues to work in the toplevel, but the interpreter executing the toplevel (i.e. ocamlrun) searches for a DLL based on its configuration.
  • A new configure option, --enable-suffixing, which is enabled by default uses this name mangling scheme for the bytecode interpreter executables and the shared library versions of the OCaml runtime. In particular, this means that the bin directory of two different OCaml compilers may appear in PATH and the lib/ocaml directory of two different OCaml runtimes in LD_LIBRARY_PATH but executables compiled for either of those versions of OCaml continue to load correctly.
  • A new pair of configure options, --enable-runtime-search and --enable-runtime-search-target, control how the bytecode executables of the compiler distribution and those produced by the compiler distribution respectively search for the runtime. In particular, --enable-runtime-search[=always] builds a compiler whose bytecode executables will continue to work correctly if the compiler is moved or copied to a new location after installation.

The commit series is in three phases:

Searching

The goal of the commits in this first phase of the series is to provide the ability to have the launcher not require the absolute location of the interpreter. Fundamentally, this involves extending both stdlib/header.c and the sh-script produced by bytecomp/bytelink.ml. As with -set-runtime-default in #14244, this is a facility which is needed by some user executables (in particular, any bytecode executables installed in a "relocatable" opam switch) but not by others. This is a subtlety which I missed in the original implementation of this in 2021, which added the configuration to the runtime-launch-info file, providing the ability to build the compiler distribution with relocatable bytecode binaries (by setting stdlib/runtime-launch-info appropriately), but forcing executables produced by that compiler distribution either to be all relocatable or all not relocatable (by setting stdlib/target_runtime-launch-info). There is therefore a clear need for a command line option to select the search mode of the launcher.

Additionally, with most of the changes in this PR needing to be made equivalently between bytecomp/bytelink.ml (in POSIX Shell Command Language) and stdlib/header.c, it's desirable to be able to test executables produced with both the shebang launcher and executable launcher on the same system, but the only way to control this option is by changing the runtime-launch-info file. It'd be just about acceptable to have to do that for the test harness, but it hints at the desirability of a command line option to select between the shebang launcher and executable launcher.

Having accepted that a command line option is needed to control the search mode of the launcher, it then seems strange to be encoding a default value for it in stdlib/target_runtime-launch-info, rather than in the Config module. Similarly, having accepted the addition of a command line option to select between shebang/executable for the launcher, it made me revisit the design in #12751 for runtime-launch-info. In addition to the launcher kind (shebang/executable), runtime-launch-info also contains the configured installation location of the binaries (which, for the installed runtime-launch-info file will also match Config.bindir in ocamlcommon) and the executable launcher itself. Given that both launcher kind and search mode are proposed to be conveyable by command line option, it seems to me to be sensible to add a command line mechanism to convey the location of the runtime interpreter executables to ocamlc and change runtime-launch-info to be just the compiled executable from stdlib/header.c, with the default values for launcher kind, search mode and binary directory residing where they belong in the Config module. This makes runtime-launch-info a cross-compilation concern only, and eliminates any difference between boot/ocamlc and ./ocamlc during the build. The only caveat is that when linking we must always be explicit about the launcher kind, Config.bindir and the search mode because boot/ocamlc cannot have defaults for these. I think using command line options this way is not only simpler than #12751's use of runtime-launch-info but is semantically simpler than the camlheader files which it replaced. That change is therefore made as part of this commit series, but the explanation is here to motivate why it has been done and also to make it clear that it's a necessary change to make, especially as the older implementation not doing it this way still exists.
The underlying principle is that runtime-launch-info contains only things which are properties of the library around it and not things which can be altered when the compiler is invoked (i.e. stdlib/header.c has been compiled for a given configuration).

  • The first three commits are not strictly related, but fall in areas which are updated by this PR - somewhat hilariously, it turns out that Construct the bytecode executable header in ocamlc #12751's logic for finding sh is incorrect on Solaris. An indentation error of a large part of a function in Relocatable OCaml - test harness #14014 slipped through but, more importantly, the error reporting in the bytecode binaries test left something to be desired - especially given that the aforementioned Solaris problem triggered a failure in this test.
  • Next is a largely mechanical simplification to some logic in bytecomp/bytelink.ml. Previously, -use-runtime and -runtime-variant were processed first yielding a boolean use_runtime which indicated whether -use-runtime had been specified and a value for runtime. If -use-runtime was not specified and the compilation is not for Windows, then the runtime value is appended to the configured location for the interpreters specified in runtime-launch-info i.e. "ocamlrun" computed in the first check becomes "/usr/local/bin/ocamlrun" in the second step on Unix, but remains "ocamlrun" on Windows. However, if -use-runtime is specified, then the value is unaltered. This is all a bit obtuse (and I think probably my fault originally...), and it's much clearer to combine these.
  • Next, the build system is extended to support single quotes in --prefix. If someone can, they probably will (in this case, it was useful to test that the various de-quoting functions needed in the harness would not trip over single quotes in the values themselves).
  • ocamlobjinfo is extended to display information on the runtime of a standalone bytecode executable (the tools/ocamlsize Perl script already has this ability). Given the various changes with name mangling, this seems a very pertinent piece of information to be able to obtain. For the executable launcher, this is simply the content of the RNTM bytecode section. For the shebang launcher, it has to be read from the shebang or shell script. Code to do this is already present in the test harness from Relocatable OCaml - test harness #14014, but because this parsing gets more complex with subsequent changes, it's instead rewritten as a lexer with the new function Byterntm.read_runtime.
  • -launch-method is added to ocamlc and then used in the in-prefix tests so that systems which support shebang scripts test both the executable launcher and the shebang launcher. The option takes a single parameter which exactly corresponds to the first line of runtime-launch-info.
  • -launch-method is then extended to support an additional part of the argument specifying the directory containing the binaries so that -launch-method 'sh /home/user/bin' simultaneously conveys both that a shebang launcher is to be used and that the interpreters reside in /home/user/bin. Once bootstrapped, -launch-method can now be used in boot/ocamlc to remove the need for the first two lines of boot/runtime-launch-info. Config.target_bindir and Config.launch_method contain the two lines otherwise added to stdlib/target_runtime-launch-info. The rest is plumbing, with the removal of all the parsing code for runtime-launch-info and a certain amount of temporary plumbing to cope with whether boot/ocamlc has been bootstrapped or not.
  • -runtime-search is then implemented, which provides three modes for locating the bytecode interpreter. disable is the existing behaviour, and requires the runtime to be located at the absolute location the compiler was configured with. always instead first looks in the directory containing the bytecode executable itself and then searches PATH, if necessary. In particular, binaries installed in an opam switch's bin directory always use the runtime in that switch's bin directory. Finally, enable provides a hybrid of both approaches where the bytecode executable looks for the runtime in the absolute location the compiler was configured with (the disable mode), then looks in the directory containing the bytecode executable, then searches in PATH (the always mode). The -launch-method and -runtime-search options together make it trivial to test all 6 combinations in the test harness. At this stage, Windows defaults to -runtime-search always where Unix defaults to -runtime-search disable, which just about corresponds to the behaviour of the Windows executable launcher, with one slight improvement. Previously, the runtime (e.g. ocamlrun) was searched for in Path using SearchPathW. The new behaviour first checks the directory containing the bytecode executable, which is a general usability improvement (this would have been useful, for example, when the change was originally made, as it would have removed the need to put C:\Program Files\OCaml\bin into Path) and also brings the search behaviour more into line with LoadLibraryW (https://learn.microsoft.com/en-us/windows/win32/api/libloaderapi/nf-libloaderapi-loadlibraryw) which, when loading a DLL, first looks in the directory containing the executable.
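
As a rough illustration of the three search modes described in the last bullet, the order in which a launcher would try locations can be modelled as a pure function. This is only a sketch: the type, the function names and the Unix-style `:` separator are assumptions made for the example, and the real logic lives in stdlib/header.c and in the shell script emitted by bytecomp/bytelink.ml.

```ocaml
(* Hypothetical model of the -runtime-search modes; these names are
   illustrative and are not the compiler's own. *)
type runtime_search = Disable | Enable | Always

(* Join a directory and a file name with '/' (Unix convention assumed). *)
let join dir file = dir ^ "/" ^ file

(* Split a PATH-style string on ':' and drop empty entries. *)
let path_dirs path =
  List.filter (fun d -> d <> "") (String.split_on_char ':' path)

(* Return, in order, the locations that would be tried for [runtime]. *)
let candidates ~search ~bindir ~exe_dir ~path runtime =
  (* disable: only the absolute location the compiler was configured with *)
  let absolute = [join bindir runtime] in
  (* always: the directory of the bytecode executable, then PATH *)
  let searched =
    List.map (fun dir -> join dir runtime) (exe_dir :: path_dirs path)
  in
  match search with
  | Disable -> absolute
  | Always -> searched
  | Enable -> absolute @ searched

let () =
  candidates ~search:Enable ~bindir:"/usr/local/bin"
    ~exe_dir:"/home/user/bin" ~path:"/usr/bin:/bin" "ocamlrun"
  |> List.iter print_endline
```

With `Enable`, the configured location comes first and the searched locations follow, which matches the "hybrid" description; a real launcher would stop at the first candidate that exists.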

Suffixing

With the compiler now able to search for its runtimes, the next commits introduce an approach to name mangling and use it to suffix the filenames of the bytecode interpreters, shared runtime libraries and shared bytecode stub libraries.

I have attempted to document the mangling scheme in-tree in runtime/Mangling.md. In summary:

  • We have various bits of information - each bit is either a property of the distribution (e.g. its version) or a property of its particular configuration (e.g. --enable-flambda, --disable-flat-float-array, --without-zstd, etc.)
  • The configuration bits can affect a combination of the native runtime, the bytecode runtime and the bytecode interpreter
  • The interpreter executables are found using just the bits that affect reading that bytecode image (version, marshalled compression, int63 constants, shared libraries, etc.)
  • Each particular ocamlrun interpreter then loads .so based on the bits affecting the bytecode runtime - which includes bits which do not affect the execution of the bytecode itself. In particular, .so files are mangled with the machine triplet

The sequence:

  • The first commit simply sets up the infrastructure for these IDs.
  • The ID is then used to mangle the bytecode interpreter filenames, with --disable-suffixing added to configure to keep the existing behaviour. ocamlrun is now installed as xxxx-ocamlrun-bbbb where xxxx is the triplet that the runtime executes on and bbbb is the Bytecode Runtime ID. This executable is symlinked as ocamlrun and also as ocamlrun-zzzz where zzzz is the Zinc Runtime ID. When the tree is configured with --enable-suffixing, bytecomp/bytelink.ml uses ocamlrun-zzzz rather than ocamlrun when determining the runtime name. Note that -use-runtime is unaffected - if the runtime to use is explicitly stated then it overrides the name mangling. The information for the Zinc Runtime ID is split into two halves - the low bits, consisting of the release number and, intentionally, bits which are always zero, are universal and so come from Config (these are the first two characters). The high bits, which are configuration-specific, are put in runtime-launch-info (recall from the earlier stripping of information - this is information which cannot be dynamically changed when invoking the compiler) and merged by bytecomp/bytelink.ml.
  • libasmrun.so and libcamlrun.so then get the same treatment, becoming libasmrun-xxxx-bbbb.so and libcamlrun-xxxx-nnnn.so with a symlink created with the original unmangled name. The use of -runtime-variant for enabling the shared runtime has been something of a hack since it was originally added, but that's for another day - for now, both asmcomp/asmlink.ml and bytecomp/bytelink.ml recognise -runtime-variant "_shared" and correctly mangle the name.
  • Finally, the scheme is extended to the bytecode C stub libraries so that dllunixbyt.so becomes dllunixbyt-xxxx-bbbb.so. The implementation is mostly mechanical. Although cma format is updated, note that the bootstrap can be delayed because boot/ocamlc is, by definition, only ever passed cma files which have lib_dllibs = [], so it doesn't matter that boot/ocamlc has the wrong "type", because it will never see a list.
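
The file names in this sequence can be sketched as simple string construction. The record, the function names and the ID values used to exercise them are placeholders invented for the example; the actual encoding of the IDs is what runtime/Mangling.md documents, and the `.so` extension is assumed here for brevity.

```ocaml
(* Illustrative only: field and function names are hypothetical. *)
type ids = {
  triplet : string;      (* xxxx: the triplet the runtime executes on *)
  bytecode_id : string;  (* bbbb: Bytecode Runtime ID *)
  zinc_id : string;      (* zzzz: Zinc Runtime ID *)
}

(* Name under which an interpreter such as "ocamlrun" is installed. *)
let installed_name ids runtime =
  Printf.sprintf "%s-%s-%s" ids.triplet runtime ids.bytecode_id

(* Symlinks created next to the installed interpreter. *)
let symlinks ids runtime =
  [runtime; Printf.sprintf "%s-%s" runtime ids.zinc_id]

(* Runtime name recorded by the bytecode linker: an explicit -use-runtime
   value wins; otherwise the name is mangled only when suffixing is on. *)
let runtime_name ~suffixing ?use_runtime ids runtime =
  match use_runtime with
  | Some explicit -> explicit
  | None ->
      if suffixing then Printf.sprintf "%s-%s" runtime ids.zinc_id
      else runtime

(* Mangled name of a bytecode C stub library such as "dllunixbyt". *)
let stub_name ids dll =
  Printf.sprintf "%s-%s-%s.so" dll ids.triplet ids.bytecode_id
```

Keeping `-use-runtime` as an override in `runtime_name` mirrors the note above that an explicitly stated runtime is not mangled.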

Bootstrap and utilisation

The final phase of the commits allows the compiler to use all of these features. Each one of the introduced changes in the second phase notionally requires a bootstrap, but the commit series has been organised such that only a single bootstrap is required, with a series of ~~elegant workarounds~~ gross hacks then removed in the following commit. Finally, the plumbing to implement --enable-runtime-search and --enable-runtime-search-target is available, along with the updates to the tests. Note that in the interests of sanity, but not for any particular implementation reasons, --enable-runtime-search and --enable-runtime-search-target both require --enable-suffixing (i.e. the escape hatch is there to allow all of this to be disabled, but it must then all be disabled).

@dra27 dra27 added the run-crosscompiler-tests and CI: Full matrix labels Sep 15, 2025
@dra27 dra27 added the relocatable label Sep 15, 2025
Contributor

@shym shym left a comment

I won’t have more time this week to finish my review of this PR; I reviewed all the commits up to the bootstrap. It’s, again, a very impressive piece of work, congratulations!
I think I dug deeper than what I did for 14244, or maybe it’s just that I needed less context to grasp what was happening. And it always made perfect sense.

Apart from the following, my remarks are attached to the corresponding lines.

  • 130074e commit message doesn’t do that commit justice: it contains a whole lot more than just a configure update (which is pretty nice, by the way :-D)

Comment on lines +448 to +451
| Config.Executable, bindir ->
Executable, bindir
| Config.Shebang sh, bindir ->
Shebang_bin_sh (Option.value ~default:"sh" sh), bindir
Contributor

Reading the documentation for launch_method in Config.mli, I expected to find a case that would yield Shebang_runtime. Is that not the case because, when -launch-method is not specified, Shebang_runtime is the default launch method when possible (which is how I understand the logic in write_header)?

Member Author

This is legacy from #12751 (i.e. that -launch-method is just replacing some of the information which was read from the header before). Before, runtime-launch-info was parsed early in the process, mainly to report errors as soon as possible, but that doesn't matter now - I think this block can be folded into the logic at L472 and will consequently look a lot less unclear, yes! The evil part is the wildcard pattern in L488-489:

        | _ ->
            Executable

which subtly encodes the fallback to the executable header if sh can't be found when needed.
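
One way to make that fallback visible rather than hidden in a wildcard is sketched below. The types are simplified stand-ins named after the constructors mentioned in this thread, and the two function parameters (`valid_in_shebang`, `find_sh`) are hypothetical hooks introduced for the example; none of this is the code from bytecomp/bytelink.ml.

```ocaml
(* Simplified, hypothetical types for illustration. *)
type launch_method = Executable | Shebang of string option

type launcher =
  | Exe_header                 (* compiled stub from stdlib/header.c *)
  | Shebang_runtime of string  (* #! line naming the runtime directly *)
  | Shebang_bin_sh of string   (* #! line naming sh, followed by a script *)

(* [valid_in_shebang] decides whether the runtime path may appear in a #!
   line; [find_sh] looks for sh when no path was given. Both are assumed
   to be supplied by the caller. *)
let choose ~valid_in_shebang ~find_sh meth runtime =
  match meth with
  | Executable -> Exe_header
  | Shebang _ when valid_in_shebang runtime -> Shebang_runtime runtime
  | Shebang (Some sh) -> Shebang_bin_sh sh
  | Shebang None ->
      match find_sh () with
      | Some sh -> Shebang_bin_sh sh
      | None -> Exe_header  (* explicit fallback when sh cannot be found *)
```

Spelling out the last case means the compiler's exhaustiveness check covers every combination, so a new launch method could not silently fall through to the executable header.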

utils/config.mli Outdated
Comment on lines +349 to +352
| Shebang of string option (** Use shebang-style launcher, either directly to
the runtime, or via sh. The parameter if
specified is the full path to sh, otherwise the
linker searches for it. *)
Contributor

Could the documentation be updated to specify when the runtime will be used? (Related to my comment on bytelink.ml)

Member Author

Yes, I see what you mean - I guess I was thinking of it in terms of just selecting between executable / shebang where shebang has a little bit of extra metadata (the way to find sh, if it's needed) but it makes sense to make it explicit that that's what's used when the path to the runtime itself is not valid in a shebang.

utils/config.mli Outdated
Comment on lines +362 to +366
type search_method =
| Absolute (** Check fixed absolute location only *)
| Absolute_then_search (** Check fixed absolute location, but perform a search
if that fails *)
| Search (** Always search for the interpreter *)
Contributor

In my recurring fixation with names, I’d suggest to pick only one name in each of the pairs Config vs command-line:

  • Config.search_method vs -runtime-search
  • Config.Absolute vs disable
  • etc.

An alternate proposition: type runtime_search = Always | Never | Lazily.

I would let the documentation in Config (and the commandline help, as far as it must be) expand what it all means.

Member Author

The fixation is most helpful, thank you! For the constructors vs the command line parameter, I think I was going with the idea that the name (enable, disable, fallback) specified user intent where the constructor (Absolute, Absolute_then_search, Search) specified implementation. That possibly made more sense when this was being read from runtime-launch-info and all the same file, but I guess it does make sense split across command-line parsing, Config, and so forth to be using one name.

But in which case, is using the CLI names OK - i.e. is -runtime-search enable|disable|fallback OK, and use those to drive the names of and in the type? (I prefer fallback to lazily, FWIW, though as ever I'm not super attached to names!)

Contributor

But in which case, is using the CLI names OK - i.e. is -runtime-search enable|disable|fallback OK, and use those to drive the names of and in the type?

Yes, I think that would be nice.

(I prefer fallback to lazily, FWIW, though as ever I'm not super attached to names!)

Yes, fallback is a good choice. Lazily is maybe a bit too λ-calculus-y 😅
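
For illustration, the outcome of this naming discussion could look like the sketch below, where a single set of names drives both the type and the command-line spelling. The constructor and function names are assumptions for the example and may not match what was finally merged.

```ocaml
(* Hypothetical unified naming: constructors mirror the CLI spellings. *)
type runtime_search = Disable | Fallback | Enable

(* Spelling used on the command line for -runtime-search. *)
let to_string = function
  | Disable -> "disable"
  | Fallback -> "fallback"
  | Enable -> "enable"

(* Parse a -runtime-search argument; None signals an invalid value. *)
let of_string = function
  | "disable" -> Some Disable
  | "fallback" -> Some Fallback
  | "enable" -> Some Enable
  | _ -> None
```

With one vocabulary, `of_string (to_string m)` is always `Some m`, and the documentation in Config only needs to explain each name once.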


(* Writes the shell script version of the bytecode launcher to outchan *)
let write_sh_launcher outchan bin_sh bindir search runtime =
let open struct type tag = D | A | E end in
Contributor

@shym shym Oct 9, 2025

write_sh_launcher is pretty fun to read, even though I must say I’m not entirely convinced that to keep the scripts factorised that way will be helpful for maintenance (with the l function and the extra comment, it might even run longer?). I would personally unshare at least D from A and E.

Anyway, changing the tags to

Suggested change
let open struct type tag = D | A | E end in
let open struct type tag = DAE | AE | E end in

would have helped me decode the meaning of the tags and the generation code, I think.

Member Author

I can possibly dig out one of the old commits where these were separated - there was an intermediate state where they got updated a second time, and it was really unclear (that's why I ended up rewriting it this way).

I wasn't clear what you mean by "it might even run longer" - the code's physically longer, or it's slower? (I'm not sure I'm bothered in either case - the main thing was to be able to physically see the entire shell script).

Member Author

The revised constructor names are a great idea, ta!

Contributor

I wasn't clear what you mean by "it might even run longer" - the code's physically longer, or it's slower?

I meant physically longer, because to share the D’s 2 lines you need to define exec.
(I don’t expect anything noticeable regarding speed there)

Comment on lines +340 to +344
if tag = D || tag = A && search <> Config.Absolute
|| tag = E && search = Config.Absolute_then_search then begin
output_string outchan (String.trim s);
output_char outchan '\n'
end
Contributor

I would find it easier to read that way, I think:

Suggested change
if tag = D || tag = A && search <> Config.Absolute
|| tag = E && search = Config.Absolute_then_search then begin
output_string outchan (String.trim s);
output_char outchan '\n'
end
match (tag, search) with
| D, _
| A, (Search | Absolute_then_search)
| E, Absolute_then_search ->
output_string stdout (String.trim s);
output_char stdout '\n'
| _ -> ()

(especially if the tag and search constructors were renamed to match)
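
Putting the two suggestions together (renamed tags plus the pattern match), a self-contained sketch of the filtering might read as follows. It writes to a `Buffer` rather than to `outchan` so that it can be run in isolation, and the script lines passed to it are placeholders rather than the real launcher script.

```ocaml
(* Search constructors as they appear in the diff under review. *)
type search = Absolute | Absolute_then_search | Search

(* Tags renamed as suggested: each name lists the modes (Disable, Always,
   Enable) in which a line tagged with it is emitted. *)
type tag = DAE | AE | E

(* Does a line with this tag belong in the script for this search mode? *)
let emits tag search =
  match tag, search with
  | DAE, _
  | AE, (Search | Absolute_then_search)
  | E, Absolute_then_search -> true
  | _ -> false

(* Append the trimmed line [s] to [buf] when its tag applies. *)
let l buf search tag s =
  if emits tag search then begin
    Buffer.add_string buf (String.trim s);
    Buffer.add_char buf '\n'
  end

(* Render a list of tagged lines for a given search mode. *)
let render search lines =
  let buf = Buffer.create 64 in
  List.iter (fun (tag, s) -> l buf search tag s) lines;
  Buffer.contents buf

let () =
  print_string
    (render Search
       [DAE, "# in every mode"; AE, "# when searching"; E, "# enable only"])
```

`emits` accepts exactly the same (tag, search) pairs as the boolean condition in the original lines, with D, A and E renamed to DAE, AE and E.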

Makefile Outdated
Comment on lines +2744 to +2751
runtime/$(1)$(EXE) \
"$(INSTALL_BINDIR)/$(call MANGLE_RUNTIME_NAME,$(1))"
ifeq "$(SUFFIXING)" "true"
cd "$(INSTALL_BINDIR)" && \
$(LN) "$(TARGET)-$(1)-$(BYTECODE_RUNTIME_ID)$(EXE)" "$(1)$(EXE)"
cd "$(INSTALL_BINDIR)" && \
$(LN) "$(TARGET)-$(1)-$(BYTECODE_RUNTIME_ID)$(EXE)" \
"$(1)-$(ZINC_RUNTIME_ID)$(EXE)"
Contributor

A couple of lines are not starting with a Tab. I didn’t know make was accepting continuation lines not starting with Tab, but that breaks visual indentation for me. Did you mean to write:

Suggested change
runtime/$(1)$(EXE) \
"$(INSTALL_BINDIR)/$(call MANGLE_RUNTIME_NAME,$(1))"
ifeq "$(SUFFIXING)" "true"
cd "$(INSTALL_BINDIR)" && \
$(LN) "$(TARGET)-$(1)-$(BYTECODE_RUNTIME_ID)$(EXE)" "$(1)$(EXE)"
cd "$(INSTALL_BINDIR)" && \
$(LN) "$(TARGET)-$(1)-$(BYTECODE_RUNTIME_ID)$(EXE)" \
"$(1)-$(ZINC_RUNTIME_ID)$(EXE)"
runtime/$(1)$(EXE) \
"$(INSTALL_BINDIR)/$(call MANGLE_RUNTIME_NAME,$(1))"
ifeq "$(SUFFIXING)" "true"
cd "$(INSTALL_BINDIR)" && \
$(LN) "$(TARGET)-$(1)-$(BYTECODE_RUNTIME_ID)$(EXE)" "$(1)$(EXE)"
cd "$(INSTALL_BINDIR)" && \
$(LN) "$(TARGET)-$(1)-$(BYTECODE_RUNTIME_ID)$(EXE)" \
"$(1)-$(ZINC_RUNTIME_ID)$(EXE)"

Member Author

That was intentional, indeed - in fact, I think check-typo might reject your suggestion? The tab is only required at the start of the recipe line itself. It should work if tabstop is set to 2 😉

Member Author

Actually, looking again, it may have been intentional, but it's inconsistent! I've corrected them 🫣

stdlib/Makefile Outdated
Comment on lines +89 to +90
@{ printf '$(if $(MANGLING),$(ZINC_RUNTIME_ID_HI),\000)'; \
cat $^; } > $@
Contributor

Another missing Tab.

Suggested change
@{ printf '$(if $(MANGLING),$(ZINC_RUNTIME_ID_HI),\000)'; \
cat $^; } > $@
@{ printf '$(if $(MANGLING),$(ZINC_RUNTIME_ID_HI),\000)'; \
cat $^; } > $@

There’s another in 3a625a9, but it vanishes post-bootstrap.

Makefile Outdated
Comment on lines +2761 to +2766
runtime/lib$(1)_shared$(EXT_DLL) \
"$(INSTALL_LIBDIR)/$(call MANGLE_RUNTIME_DLL_NAME,$(1),$(2))"
ifeq "$(SUFFIXING)" "true"
cd "$(INSTALL_LIBDIR)" && \
$(LN) "$(call MANGLE_RUNTIME_DLL_NAME,$(1),$(2))" \
"lib$(1)_shared$(EXT_DLL)"
Contributor

Other missing Tabs.

Suggested change
runtime/lib$(1)_shared$(EXT_DLL) \
"$(INSTALL_LIBDIR)/$(call MANGLE_RUNTIME_DLL_NAME,$(1),$(2))"
ifeq "$(SUFFIXING)" "true"
cd "$(INSTALL_LIBDIR)" && \
$(LN) "$(call MANGLE_RUNTIME_DLL_NAME,$(1),$(2))" \
"lib$(1)_shared$(EXT_DLL)"
runtime/lib$(1)_shared$(EXT_DLL) \
"$(INSTALL_LIBDIR)/$(call MANGLE_RUNTIME_DLL_NAME,$(1),$(2))"
ifeq "$(SUFFIXING)" "true"
cd "$(INSTALL_LIBDIR)" && \
$(LN) "$(call MANGLE_RUNTIME_DLL_NAME,$(1),$(2))" \
"lib$(1)_shared$(EXT_DLL)"

if Config.suffixing then
[Misc.RuntimeID.shared_runtime Sys.Native]
else
["-lasmrun_shared"]
Contributor

I suppose this is intentionally not Load_path.find "libasmrun..." (I miss context to know whether that could make any difference but it seems to bring the asmlink closer to the bytelink).

Member Author

Both Bytelink and Asmlink are linking the _shared variant in the same way (with -l) - I've left well alone on the difference in load_path stuff, just because the history of why they're that way wasn't totally clear to me (and _shared is an abuse of -runtime-variant that I'd like to fix some day... static/shared linking should be an option for all the runtime variants!)

bytecomp/dll.mli Outdated
Comment on lines +18 to +19
(* Extract the name of a DLLs from its external name (xxx.so or -lxxx) *)
val extract_dll_name: string -> string
val extract_dll_name: bool * string -> string
Contributor

IIUC, the extension (.so) is no longer supported when in suffixed mode, aka when the bool is true. (It’s unclear why the suffix is no longer supported, though).

Member Author

The suffix wasn't embedded before either, I think? It's a nod to portability that we specify the name of the dll (e.g. dllunixbyt) and Unix appends .so and Windows appends .dll. In suffixed mode, I was figuring that you're really just giving the central portion and allowing the runtime to do all of the remaining demangling - the problem with allowing .so is that it means the runtime has to insert characters before the extension which seemed a bit odd?

Contributor

Sorry I wasn’t clear: I don’t think supporting .so extensions is useful either; but I’d suggest updating the comment to describe what is actually supported.
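To make the two accepted forms concrete, here is a minimal shell sketch of the extraction under discussion. It is not the code in bytecomp/dll.ml: the function name is reused for illustration only, `.so` stands in for the platform's DLL extension, and the suffixed-mode behaviour (an explicit extension is left untouched rather than stripped) is an assumption drawn from this thread.

```sh
# Sketch only, not the implementation in bytecomp/dll.ml.
# extract_dll_name SUFFIXED NAME
#   SUFFIXED is "true" or "false"; NAME is the external name.
# Assumption: in suffixed mode an explicit extension is not recognised,
# because the runtime completes the file name itself.
extract_dll_name() {
  suffixed=$1 name=$2
  case "$name" in
    -l*)
      # -lxxx refers to the stub library dllxxx
      printf 'dll%s\n' "${name#-l}" ;;
    *.so)
      if [ "$suffixed" = true ]; then
        printf '%s\n' "$name"
      else
        printf '%s\n' "${name%.so}"
      fi ;;
    *)
      printf '%s\n' "$name" ;;
  esac
}
```

For example, `extract_dll_name false -lunixbyt` and `extract_dll_name false dllunixbyt.so` both print `dllunixbyt`.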

Contributor

@shym shym left a comment

End of my review, with only tiny extra comments.

(** Indicates whether bytecode executables in the compiler distribution
use a launcher that is capable of searching PATH to find ocamlrun. At
present, only native Windows has this behaviour. *)
a launcher that is capable of searching PATH to find ocamlrun. This
Contributor

Slight over-removal in fee65df

Suggested change
a launcher that is capable of searching PATH to find ocamlrun. This
use a launcher that is capable of searching PATH to find ocamlrun. This

l E {|fi |};
l A {|if test -z "$c"; then |};
l A {| echo 'This program requires an OCaml %s interpreter'>&2|} release;
l A {| echo "$r not found either with $0 or in \$PATH">&2 |};
Contributor

I vaguely wonder whether “with” might not be clear enough. Would “alongside” be better? Or explicitly “in the directory of”?

Member Author

"alongside" sounds better, indeed!
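For readers following the wording discussion, the search that this message reports on can be sketched in plain sh. This is an illustration under stated assumptions, not the script the compiler emits: `find_runtime` and `report_missing` are hypothetical names, and the message uses the "alongside" wording agreed above. PATH entries containing glob characters are not handled.

```sh
# Sketch only; the real launcher script is generated by the compiler.
# find_runtime NAME SELF prints the path of an executable called NAME,
# looking first in the directory containing SELF, then in each PATH entry.
find_runtime() {
  name=$1 self=$2
  case $self in
    */*) dir=${self%/*} ;;
    *) dir=. ;;
  esac
  if [ -f "$dir/$name" ] && [ -x "$dir/$name" ]; then
    printf '%s\n' "$dir/$name"
    return 0
  fi
  old_ifs=$IFS
  IFS=:
  for d in $PATH; do
    # An empty PATH entry means the current directory.
    if [ -f "${d:-.}/$name" ] && [ -x "${d:-.}/$name" ]; then
      IFS=$old_ifs
      printf '%s\n' "${d:-.}/$name"
      return 0
    fi
  done
  IFS=$old_ifs
  return 1
}

# report_missing NAME SELF RELEASE writes the two-line diagnostic to stderr.
report_missing() {
  echo "This program requires an OCaml $3 interpreter" >&2
  echo "$1 not found either alongside $2 or in \$PATH" >&2
}
```

A launcher built on this would call `find_runtime "$r" "$0"` and fall through to `report_missing` when nothing is printed.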

"<method> Specify the mechanism for the bytecode launcher:\n\
\ exe - use the executable launcher in runtime-launch-info\n\
\ sh - use a #!, using sh if the interpreter path cannot be used\n\
\ /path/interpreter - use #!, or the given sh-compatible \
Contributor

Suggested change
\ /path/interpreter - use #!, or the given sh-compatible \
\ /path/interpreter - use #!, or the given sh-compatible\n\

Member Author

I thought that this was formatting correctly without that one, but perhaps not? I'll double-check

Member Author

Oh no, indeed - corrected!

@dra27
Member Author

dra27 commented Oct 22, 2025

Thanks, @shym! I'll endeavour to rebase this and then put the various fixups inline as extra commits to ease checking (I'll double-check whether the tabs-in-Makefiles is definitely what I'd had in mind as well!)

@dra27 dra27 force-pushed the runtime-searching branch 2 times, most recently from 4cbc8b8 to 1bbf1ea Compare November 9, 2025 12:03
@dra27
Member Author

dra27 commented Nov 9, 2025

Rebased - review responses to follow

@dra27 dra27 force-pushed the runtime-searching branch from 1bbf1ea to 7018d70 Compare November 10, 2025 22:57
@dra27
Member Author

dra27 commented Nov 10, 2025

How's that looking, @shym? I still need to update the man pages and manual for -launch-method, -runtime-search and -dllib-suffixed, but I hope I've addressed everything else!

@dra27 dra27 force-pushed the runtime-searching branch from 7018d70 to abd3e4c Compare November 11, 2025 10:30
@dra27
Member Author

dra27 commented Nov 11, 2025

manpages and documentation now updated, too

@dra27 dra27 force-pushed the runtime-searching branch 2 times, most recently from f4222b9 to fa12746 Compare November 11, 2025 14:14
@dra27 dra27 closed this Nov 11, 2025
@dra27 dra27 reopened this Nov 11, 2025
@dra27 dra27 force-pushed the runtime-searching branch from fa12746 to f43e35c Compare November 12, 2025 08:00
@dra27
Member Author

dra27 commented Nov 12, 2025

Surfacing a side-channel discussion with @shym - I've updated the code around -runtime-search to use constructor names consistent with the present options accepted by the flag, which is disable, enable, always (so "runtime searching is disabled, runtime searching is enabled, runtime searching is always performed"), but we think it is clearer to go with the suggestion in the thread above: disable, fallback, enable, and I'll update the commits to reflect this.

Contributor

@shym shym left a comment

I went through the range-diff and the “Review” commits (thank you so much for that clean update!): I managed to find only tiny typos:

  • in e808946 commit message: is there an “is” missing, in “Misc.RuntimeID is added…”?
  • and the two inlined suggestions.

So this looks good to me (with, as you say, a preference for a D/F/E version)!

Printf.sprintf
" Control the way the bytecode header searches for the interpreter\n\
\ The following settings are supported:\n\
\ disable use a fixed absolute path to the nterpreter\n\
Contributor

Suggested change
\ disable use a fixed absolute path to the nterpreter\n\
\ disable use a fixed absolute path to the interpreter\n\

utils/config.mli Outdated
| Disable
(** Interpreter searching disabled - check fixed absolute location only *)
| Enable
(** Check fixed absolute location first, but fallback to a search if that
Contributor

Wordnet suggests “fallback” is only a noun, not a verb; but it may be overly conservative.

Suggested change
(** Check fixed absolute location first, but fallback to a search if that
(** Check fixed absolute location first, but fall back to a search if that

Member

@damiendoligez damiendoligez left a comment

Approved after interactive review; a few changes pending.

@dra27 dra27 force-pushed the runtime-searching branch 2 times, most recently from 1d44ac7 to f6240ed Compare November 30, 2025 18:06
@dra27
Member Author

dra27 commented Nov 30, 2025

(rebased prior to addressing @damiendoligez's review comments)

@dra27 dra27 force-pushed the runtime-searching branch 4 times, most recently from 8e876f9 to 0d887f5 Compare December 3, 2025 15:20
@dra27
Member Author

dra27 commented Dec 3, 2025

I messed up the renaming of the options for -runtime-search - pushed back to just having addressed all the other review comments, and I shall re-do that change a bit more carefully this time (well done CI...)

The changes made so far can be seen here.

Minor tweaks needed to allow configuring, say, for "$PWD/install'd here"

ocamlobjinfo now parses both RNTM and shebang lines in order to display
the runtime being used by a bytecode executable.

When linking a normal bytecode executable, allows an explicit selection
of either the executable or shebang header, regardless of the value in
runtime-launch-info.

-launch-method encapsulates the first line of runtime-launch-info. The
argument to -launch-method is extended slightly to encompass the second
line, thus `-launch-method 'sh /usr/local/bin'` represents the default
runtime-launch-info file on Unix. Additional fields are added to Config
so that the installed compiler simply uses default values, rather than
reading the two lines from runtime-launch-info. The build of the
compiler itself explicitly uses `-launch-method`, which leaves only the
executable launcher compiled from stdlib/header.c in
runtime-launch-info.
@dra27 dra27 force-pushed the runtime-searching branch from dd4a619 to 7176a48 Compare December 11, 2025 18:23
dra27 added 13 commits December 11, 2025 18:25
-runtime-search {disable|enable|always} adds new features to the
launcher used for bytecode executables which do not embed their own
runtime. By default, the header continues to behave as before - the
launcher will attempt to start the runtime using the absolute path which
the compiler was configured with.

The new search mode will then search for the runtime first in the
directory containing the running executable and then in PATH.
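
The order in which a launcher consults these locations can be summarised with a small sh sketch. `candidates` is a hypothetical helper, `PATH:NAME` is a placeholder meaning "look NAME up in PATH", and the reading of `always` as skipping the fixed absolute path entirely is an assumption; `enable` follows the documented behaviour of trying the fixed location first and then falling back to a search.

```sh
# Sketch only: prints, one per line, the places a launcher would try.
# candidates MODE FIXED NAME SELF
#   MODE  - disable, enable or always
#   FIXED - absolute path the compiler was configured with
#   NAME  - file name of the runtime being searched for
#   SELF  - path of the running bytecode executable
candidates() {
  mode=$1 fixed=$2 name=$3 self=$4
  case $self in
    */*) dir=${self%/*} ;;
    *) dir=. ;;
  esac
  case $mode in
    disable) printf '%s\n' "$fixed" ;;
    enable)  printf '%s\n' "$fixed" "$dir/$name" "PATH:$name" ;;
    # Assumption: "always" never consults the fixed location.
    always)  printf '%s\n' "$dir/$name" "PATH:$name" ;;
    *) echo "unknown mode: $mode" >&2; return 2 ;;
  esac
}
```

Calling `candidates enable /usr/local/bin/ocamlrun ocamlrun /opt/app/prog` lists the fixed path, then `/opt/app/ocamlrun`, then the PATH lookup.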
The configuration of a given version of OCaml is now described using a
combination of the host triplet and a new ID value (documented in
runtime/Mangling.md).
Misc.RuntimeID is added to manipulate these IDs and configure calculates
the values required for the bytecode and native runtimes. These are then
intended to be used to mangle filenames so that different configurations
which store files in public search paths cease interfering with each
other.

New option --disable-suffixing controls whether the build should use any
of the computed values for mangling its own files.

New names for libcamlrun_shared.so and libasmrun_shared.so without the
_shared suffix and using the target triplet and runtime ID. Both ocamlc
and ocamlopt explicitly recognise `-runtime-variant _shared` and select
the correct name.

Symbolic links for libcamlrun_shared.so and libasmrun_shared.so to allow
any C programs which linked against the output of `-output-obj` to
continue to work.
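
The install step behind these compatibility links can be sketched in sh, mirroring the Makefile fragment reviewed earlier in this thread (copy under the mangled name, then link from inside the library directory). `install_shared_runtime` is a hypothetical helper and the mangled name passed to it is a placeholder, not a real runtime ID.

```sh
# Sketch only: install a shared runtime under its mangled name and keep
# the legacy name available as a symbolic link.
# install_shared_runtime SRC LIBDIR MANGLED LEGACY
install_shared_runtime() {
  src=$1 libdir=$2 mangled=$3 legacy=$4
  mkdir -p "$libdir"
  cp "$src" "$libdir/$mangled"
  # Link from inside LIBDIR so the target is a bare file name; such a
  # relative link stays valid if the whole directory is moved.
  (cd "$libdir" && ln -sf "$mangled" "$legacy")
}
```

With this layout, a C program linked with `-lcamlrun_shared` still resolves the legacy name, while the file itself carries the configuration-specific name.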
ocamlc -dllib-suffixed appends the runtime's host triplet and bytecode
runtime ID to the supplied name when searching for the DLL, and records
the base name only in .cma / executable files.

ocamlmklib -suffixed instructs ocamlmklib to use -dllib-suffixed when
generating .cma files instead of -dllib.

The effect is that stub libraries built this way have names which will
be unique for a given configuration of OCaml and so will be ignored by
other runtimes.
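
A short sh sketch shows why such stubs are invisible to other configurations. The `BASE-HOST-ID.so` layout, the helper names and the ID values are all hypothetical; the actual format is the one documented in runtime/Mangling.md.

```sh
# Sketch only. Hypothetical layout: BASE-HOST-ID.so
# suffixed_dll_name BASE HOST ID
suffixed_dll_name() {
  base=$1 host=$2 id=$3
  printf '%s-%s-%s.so\n' "$base" "$host" "$id"
}

# find_stub DIR BASE HOST ID succeeds only if DIR holds the stub built
# for exactly this host triplet and runtime ID.
find_stub() {
  file=$1/$(suffixed_dll_name "$2" "$3" "$4")
  [ -f "$file" ] && printf '%s\n' "$file"
}
```

Only the base name (`dllunixbyt`, say) is recorded in the .cma or executable, so a runtime with a different ID computes a different file name and `find_stub` fails for it instead of loading an incompatible library.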
boot/ocamlc now supports everything that the main compiler supports.

--enable-runtime-search controls the -runtime-search setting used to
build the compiler's own bytecode executables;
--enable-runtime-search-target controls the default value of
-runtime-search that ocamlc itself uses.
@dra27 dra27 force-pushed the runtime-searching branch from 7176a48 to 7e71861 Compare December 11, 2025 18:25
@dra27
Member Author

dra27 commented Dec 11, 2025

Hopefully final rebase. Thank you @shym and @damiendoligez for reviewing this one as well! It's going through precheck#1089 and, assuming nothing gets thrown up by any of the final CI checks, let's be relocatable...

@dra27 dra27 added the merge-me label Dec 11, 2025
@nojb nojb merged commit da60a2e into ocaml:trunk Dec 12, 2025
37 of 42 checks passed
@nojb
Contributor

nojb commented Dec 12, 2025

Merged. Congratulations @dra27!

Labels

CI: Full matrix Makes the CI test a bigger set of configurations merge-me relocatable towards a relocatable compiler run-crosscompiler-tests Makes the CI run the Cross compilers test workflow

4 participants