Relocatable OCaml - --with-relative-libdir #14244
shym
left a comment
There was a problem hiding this comment.
I’ve spent a little bit of time reading through this impressive PR, if only because it touches upon cross compilation :-)
My reading of the test part of the PR was really perfunctory, I dug deeper in the other parts but I can’t pretend I always understand what’s at stake. The best comment I can make already is that I didn’t find anything suspicious :-)
Anyway, I thought I’d post my review as it is now, as I have a couple of small remarks that came up during that read.
| if (caml_byte_program_mode != APPENDED || proc_self_exe == NULL) { | ||
| /* First, try argv[0] (when ocamlrun is called by a bytecode program) */ | ||
| exe_name = argv[0]; | ||
| fd = caml_attempt_open(&exe_name, &trail, 0); | ||
| } |
There was a problem hiding this comment.
Naïve question: it’s unclear to me why the condition isn’t just if (proc_self_exe == NULL). Shouldn’t proc_self_exe take always precedence over argv[0] when it can be used?
There was a problem hiding this comment.
There are so many cases that I don't think any question can be naïve! I think some of the commit message for 5395c37 should be in the comment here - my idea here is that when we compile a -custom executable, this only works if a /proc/self/exe mechanism is available.
The rationale behind this is because there's actually an obscure, but nonetheless present, security issue with a -custom executable, since you can abuse argv[0] to cause it to load a different bytecode image from the one actually appended to it.
I haven't done it in this PR, but gpakosz/whereami gives us all the code needed to have caml_executable_name work on all platforms we support.
There was a problem hiding this comment.
Yes, so many cases…
my idea here is that when we compile a -custom executable, this only works if a /proc/self/exe mechanism is available.
I agree with that principle. I would have expected then that the condition would be an && so that argv[0] is never used in APPENDED mode.
While I’m digging again in that code, I wonder something.
IIUC, caml_main is only ever called in APPENDED and STANDARD (where it opens itself to behave as APPENDED if it finds itself the concatenation of ocamlrun with some bytecode executable) modes, never in EMBEDDED. Now that you distinguish APPENDED from STANDARD, could attempt_open be really run only in the APPENDED case and the control flow be simplified?
There was a problem hiding this comment.
Ah, I realise my comment above describes what I think we should be doing after we know that caml_executable_name always works on all platforms... the || at the moment is a relaxation for platforms where caml_executable_name is not implemented. So on Linux/macOS/native-Windows, where proc_self_exe will never be NULL, the argv[0] branch can't be executed in APPENDED mode, but on, say, Cygwin/*BSD, etc. there is still the fallback.
For caml_attempt_open, I think it's worth considering this w.r.t. #13465, as in that I'm suggesting that EMBEDDED also go through caml_main (but not caml_attempt_open, obviously).
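To make the || versus && question in this thread concrete, here is a small OCaml model of the C condition. It is only a sketch: the constructor names mirror enum caml_byte_program_mode, the function names are hypothetical, and proc_self_exe is modelled as an option that is None where caml_executable_name is not implemented.

```ocaml
(* Hypothetical OCaml model of the C condition; not code from the PR. *)
type mode = Standard | Appended | Embedded

(* The condition as written in the PR:
   caml_byte_program_mode != APPENDED || proc_self_exe == NULL *)
let tries_argv0_current ~mode ~proc_self_exe =
  mode <> Appended || proc_self_exe = None

(* The stricter && variant raised in review: argv[0] is never used in
   APPENDED mode, whether or not proc_self_exe is available. *)
let tries_argv0_strict ~mode ~proc_self_exe =
  mode <> Appended && proc_self_exe = None

(* Evaluate a condition for STANDARD and APPENDED, each with and without
   a usable proc_self_exe (the path is a made-up example). *)
let table f =
  List.concat_map
    (fun mode ->
      List.map
        (fun exe -> f ~mode ~proc_self_exe:exe)
        [ Some "/usr/local/bin/prog"; None ])
    [ Standard; Appended ]
```

In this model the two variants differ in two rows: STANDARD with proc_self_exe available (the && form skips argv[0] there) and APPENDED without it (the || form keeps the fallback).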
runtime/caml/startup.h
Outdated
| EMBEDDED /* bytecode embedded in C (e.g. -output-complete-exe/-output-obj) */ | ||
| }; | ||
|
|
||
| extern enum caml_byte_program_mode caml_byte_program_mode; |
There was a problem hiding this comment.
Do I understand correctly that caml_byte_program_mode could now be declared const?
asmcomp/asmlink.ml
Outdated
| | ({ui_need_stdlib = true; _}, _, _) -> true | ||
| | _ -> false |
There was a problem hiding this comment.
| | ({ui_need_stdlib; _}, _, _) -> ui_need_stdlib |
maybe?
There was a problem hiding this comment.
Oops - I hope that was from it having been more complicated at some point in the past 🫣
| (* Link in a compilation unit *) | ||
|
|
||
| let link_compunit output_fun currpos_fun inchan file_name compunit = | ||
| let link_compunit accu output_fun currpos_fun inchan file_name compunit = |
There was a problem hiding this comment.
Nitpick: the name of the new parameter is the only one not conveying its meaning. acc_need_stdlib is too long, acc_stdlib might be enough?
There was a problem hiding this comment.
Possibly, although this combines with a follow-up PR in #14247 where fold_primitive below is now:
let fold_primitive (needs_stdlib, uses_dynlink) name =
so it's really just an accumulator being threaded?
bytecomp/bytelink.ml
Outdated
| let standard_library_default = | ||
| if standalone && needs_stdlib then | ||
| (* -set-runtime-default *) | ||
| if !Clflags.standard_library_default = None then | ||
| Some Config.standard_library_effective | ||
| else | ||
| !Clflags.standard_library_default | ||
| else | ||
| (* -custom executables don't need OSLD sections - the correct value | ||
| is already included in the runtime. *) | ||
| None | ||
| in | ||
| begin match standard_library_default with | ||
| | Some value -> | ||
| (* OCaml Standard Library Default location *) | ||
| output_string outchan value; | ||
| Bytesections.record toc_writer OSLD | ||
| | None -> () | ||
| end; |
There was a problem hiding this comment.
I was surprised to see this chunk that way (building an option to destruct it immediately) instead of something like what is in emit_runtime_standard_library_default, namely something like:
| if standalone && needs_stdlib then begin | |
| let standard_library_default = | |
| Option.value | |
| (* -set-runtime-default *) | |
| !Clflags.standard_library_default | |
| ~default:Config.standard_library_effective | |
| in | |
| (* OCaml Standard Library Default location *) | |
| output_string outchan standard_library_default; | |
| Bytesections.record toc_writer OSLD | |
| end; | |
| (* else: -custom executables don't need OSLD sections - the correct value is | |
| already included in the runtime. *) |
There was a problem hiding this comment.
Indeed - this particular part (the whole OSLD mechanism) went through so many iterations, I've just ended up with code that's much more obtuse than it needs to be!
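As a sanity check on the suggestion above, the two shapes can be compared in isolation. This is a minimal sketch, not the linker code: the effective binding stands in for Config.standard_library_effective, the override argument stands in for !Clflags.standard_library_default, and the returned list stands in for what would be written to the OSLD section.

```ocaml
(* Hypothetical stand-in for Config.standard_library_effective. *)
let effective = "/usr/local/lib/ocaml"

(* Original shape: build an option, then destructure it immediately. *)
let osld_original ~standalone ~needs_stdlib override =
  let value =
    if standalone && needs_stdlib then
      if override = None then Some effective else override
    else None
  in
  match value with
  | Some v -> [ v ] (* contents that would be written to the section *)
  | None -> []

(* Suggested shape: decide once, and let Option.value supply the default. *)
let osld_suggested ~standalone ~needs_stdlib override =
  if standalone && needs_stdlib then
    [ Option.value override ~default:effective ]
  else []
```

Both functions agree on every combination of the two flags and an absent or present override, which is the property the refactoring relies on.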
Makefile
Outdated
| rm -f $(FLEXDLL_SOURCE_DIR)/flexlink.exe | ||
| $(MAKE) -C $(FLEXDLL_SOURCE_DIR) $(FLEXLINK_BUILD_ENV) \ | ||
| OCAMLOPT='$(FLEXLINK_OCAMLOPT) -nostdlib -I ../stdlib' flexlink.exe | ||
| OCAMLOPT='$(FLEXLINK_OCAMLOPT) -nostdlib -I ../stdlib $(SET_RELATIVE_STDLIB)' flexlink.exe |
There was a problem hiding this comment.
Maybe, to shorten the line (if I’m not mistaken about USE_STDLIB)?
| OCAMLOPT='$(FLEXLINK_OCAMLOPT) $(USE_STDLIB) $(SET_RELATIVE_STDLIB)' \ | |
| flexlink.exe |
configure.ac
Outdated
| AS_CASE([$build_bindir_to_libdir], | ||
| [./*],[libdir="$bindir${build_bindir_to_libdir[#].}"], | ||
| [../*],[libdir="$bindir/$build_bindir_to_libdir"], | ||
| [AC_MSG_ERROR([--with-relative-libdir requires a relative path])])])], |
There was a problem hiding this comment.
| [AC_MSG_ERROR(m4_normalize([--with-relative-libdir requires an explicit | |
| relative path, starting with either ./ or ../]))])])], |
runtime/unix.c
Outdated
| /* If realpath fails, use the non-normalised path for error messages. */ | ||
| if (resolved_candidate != NULL) { | ||
| caml_stat_free(candidate); | ||
| /* caml_realpath uses malloc */ |
There was a problem hiding this comment.
| /* realpath uses malloc */ |
runtime/win32.c
Outdated
| UNC paths are returned as \\?\UNC\ - in this case, reuse the \\ from | ||
| \\?\. */ |
There was a problem hiding this comment.
Did you mean to say that the last \ in \\?\UNC\ is reused?
There was a problem hiding this comment.
Oh yeah - you may be unsurprised that that comment relates to a different iteration where it was copying the string back on itself - I've tried to reduce all the manually-rolled string functions and switched to caml_stat_wcsdup instead, but missed that comment!
| p "version" version; | ||
| p "standard_library_default" standard_library_default; | ||
| p "standard_library_default" standard_library_effective; | ||
| p "standard_library_relative" standard_library_relative; |
There was a problem hiding this comment.
Minor remark: Config.standard_library_relative is a bool but ocamlopt -config-var standard_library_relative is a string; maybe it doesn’t matter though.
There was a problem hiding this comment.
I haven't actually ended up using Config.standard_library_relative anywhere (in the compiler itself or in any of the adapted programs), but it felt worth having it as an explicitly stated property. We could indeed rename the string here. At the moment, you have a relocatable compiler if test -n $(ocamlopt -config-var standard_library_relative) but it could instead be relocatable if test $(ocamlopt -config-var standard_library_raw) != $(ocamlopt -config-var standard_library_default), say?
There was a problem hiding this comment.
Maybe I should have noted also that ocamlopt -config-var standard_library_default is Config.standard_library_effective rather than ...default. I’m wary of the same names being used to stand for different things between the Config module and the -config cli option, which was really my point (having an empty value correspond to a false makes sense to me).
There was a problem hiding this comment.
Yes, I see what you mean. Perhaps it's just better to have Config.standard_library_default (newly introduced in this PR) / ocamlopt -config-var standard_library_default (which already exists) being the "effective value" (and therefore always an absolute path) and Config.standard_library_relative as a string option - so it's None if the compiler is configured with an absolute libdir, and Some "../lib/ocaml", etc., otherwise? ocamlopt -config-var standard_library_relative then keeps its current interpretation (which maps quite readily from a string option)
There was a problem hiding this comment.
That would indeed make a lot of sense to me!
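The string option proposal that closes this thread can be sketched as follows. The names and values are hypothetical (the real field would be generated by configure); the point is only that an empty -config-var value corresponds to None, so the existing test -n check keeps working.

```ocaml
(* Hypothetical values for two differently configured compilers. *)
let relative_build : string option = Some "../lib/ocaml"
let absolute_build : string option = None

(* What -config-var standard_library_relative would print under the
   proposal: the relative path, or the empty string for an absolute libdir. *)
let config_var_of_relative = function
  | None -> ""
  | Some dir -> dir

(* The compiler is relocatable exactly when a relative libdir is set. *)
let is_relocatable relative = Option.is_some relative
```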
configure.ac
Outdated
| [AS_IF([test x"$withval" = 'xno'], | ||
| [bindir_to_libdir=''], |
There was a problem hiding this comment.
A quality-of-life suggestion (that could wait for later):
| [AS_CASE([$withval], | |
| [no], | |
| [bindir_to_libdir=''], | |
| [yes], | |
| [bindir_to_libdir=../lib/ocaml], |
(it would also need a nested AS_CASE to yield a Windows path when applicable in the yes case, I suppose).
Noticed this while building a cross-compiler to Windows with relative libdir, which seemed to work as expected (ie I cannot rename the directory in which it is installed, but IIUC only the combination with the other PRs would make that work).
There was a problem hiding this comment.
This is a brilliant idea, thanks! It gets rid of the two cases in the opam file, too (I absolutely want the backslashes on Windows, but I didn't like having to do that so explicitly in the opam file)
| mutable ui_force_link: bool; (* Always linked *) | ||
| mutable ui_for_pack: string option } (* Part of a pack *) | ||
| mutable ui_for_pack: string option; (* Part of a pack *) | ||
| mutable ui_need_stdlib: bool} (* caml_standard_library_nat needed *) |
There was a problem hiding this comment.
What is the release process for .cmx changes? In the semver world this would be a backwards-incompatible change, and that would require a major version bump. OCaml obviously doesn't do that. Instead, I think it implicitly treats each minor version change as possibly backwards-incompatible, and explicitly bumps Misc.Magic_number when necessary??
Do we need to bump Misc.Magic_number in this commit?
There was a problem hiding this comment.
The magic numbers are bumped unconditionally at each release, so that compiler artifacts are never compatible across different major versions of the compiler.
| | Word_size -> cst make_const_int (8*B.size_int) | ||
| | Int_size -> cst make_const_int (8*B.size_int - 1) | ||
| | Max_wosize -> cst make_const_int ((1 lsl ((8*B.size_int) - 10)) - 1) | ||
| | Ostype_unix -> cst make_const_bool (Config.target_os_type = "Unix") |
There was a problem hiding this comment.
Sigh. TIL that the split between Windows and Unix are plumbed all the way down to closures. Hopefully this is just for some internal flambda optimization. (Unrelated to this PR, but just something I'll have to research later since it sounds like a barrier for Windows<->Unix cross-compilation.)
There was a problem hiding this comment.
This isn't part of this PR (it just got refactored slightly to add the code to emit the link-time constant), but this shouldn't be antagonistic to cross-compilation, as it's entirely about it! The idea here is that Sys.os_type (and the various other things) take on the value of the target (i.e. the executable being created) rather than those of the host (the compiler).
driver/main_args.ml
Outdated
|
|
||
| let mk_set_runtime_default f = | ||
| "-set-runtime-default", Arg.String f, "<param>=<value> Set the default for \ | ||
| runtime parameter <param> to <value>" |
There was a problem hiding this comment.
This could use a list of supported parameters. ie. <param>: standard_library_default
There was a problem hiding this comment.
Yes, indeed - if (as I hope) we end up also doing PR 11 in #14247 (which sits independently at dra27#186), then this list might get a bit big for a help screen. I wonder if it would be worth pre-emptively putting the list of parameters (even when it's just one) in both the ocamlc/ocamlopt man pages and the manual and then referring to that here? (or possibly going further and having "verbose" help, but as I don't think we do that for anything else yet, that's maybe too excessive!)
|
status: I have reviewed so far the following commits:
|
What's the rationale behind the backporting comment? Even without a magic number bump (and any backports would certainly need one), there's no guarantee that the artefacts of one point release can be used with another? i.e. the magic number exists to protect marshalling and to ensure the compiler doesn't segfault, but I don't think we've ever made the stronger guarantee that, say, artefacts compiled with OCaml x.y.0 can be freely linked with artefacts compiled with x.y.1 |
|
Thank you for the reviews so far, @jonahbeckford and @shym! |
| let ui_need_stdlib = | ||
| List.fold_left (fun acc unit -> acc || unit.ui_need_stdlib) false units | ||
| in |
There was a problem hiding this comment.
| let ui_need_stdlib = List.exists (fun info -> info.ui_need_stdlib) units in |
There's a similar pattern below for ui_force_link.
There was a problem hiding this comment.
I'm not sure what possessed me to write that originally! I blame the pandemic
| link_archive accu output_fun currpos_fun file_name units | ||
|
|
||
| let link_files output_fun currpos_fun = | ||
| List.fold_left (link_file output_fun currpos_fun) false |
There was a problem hiding this comment.
| List.exists (link_file output_fun currpos_fun) |
There was a problem hiding this comment.
That's not equivalent, though - we do need to link all of the files, regardless of whether they use that symbol or not!
There was a problem hiding this comment.
Woops, I got carried away.
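The non-equivalence pointed out above comes from List.exists stopping at the first element for which the function returns true, whereas List.fold_left always visits every element. A minimal sketch, with a hypothetical list of files and a side effect standing in for the actual linking:

```ocaml
(* Hypothetical inputs: (file name, whether it needs the stdlib symbol). *)
let files = [ ("a.cmx", false); ("b.cmx", true); ("c.cmx", false) ]

(* fold_left visits every file, so every file gets "linked". *)
let link_with_fold files =
  let linked = ref [] in
  let needs_stdlib =
    List.fold_left
      (fun acc (name, needs) ->
        linked := name :: !linked; (* side effect: link the file *)
        acc || needs)
      false files
  in
  (needs_stdlib, List.rev !linked)

(* exists short-circuits after the first file that needs the symbol. *)
let link_with_exists files =
  let linked = ref [] in
  let needs_stdlib =
    List.exists
      (fun (name, needs) ->
        linked := name :: !linked; (* side effect: link the file *)
        needs)
      files
  in
  (needs_stdlib, List.rev !linked)
```

Both versions compute the same boolean, but here link_with_exists never reaches c.cmx. For a pure predicate, as in the ui_need_stdlib suggestion earlier in this review, the two forms are interchangeable.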
configure.ac
Outdated
| # It is possible to use MSYS2's "Cygwin" gcc (the equivalent of compiling native | ||
| # Cygwin), in which case $build is *-*-msys*. |
There was a problem hiding this comment.
2025-06-20 - Replacing x86_64-pc-msys with x86_64-pc-cygwin
As part of our ongoing effort to move MSYS2 closer to Cygwin, we have now replaced the x86_64-pc-msys triplet with x86_64-pc-cygwin as the default host triplet for the MSYS environment.
Not sure if our config.guess / config.sub have caught up with that, or if they're relevant here.
There was a problem hiding this comment.
I think the answer is technically "no" in both cases - however, I think it is relevant inasmuch as it would be worth noting that trying to differentiate MSYS2/Cygwin here (which it isn't, fortunately) is clearly becoming harder/unnecessary.
I didn't understand when the magic number was bumped. It was explained to me. Thanks; ignore the comment. |
| /* See Filename.generic_dirname */ | ||
| size_t n = strlen(path) - 1; | ||
| char *res; | ||
| if (n < 0) /* path is "" */ |
There was a problem hiding this comment.
This should give a compiler warning, no? n is an unsigned size_t which can never be less than zero. Think n should be ssize_t or equivalent.
There was a problem hiding this comment.
I reviewed the ptrdiff_t fix. LGTM.
runtime/win32.c
Outdated
| return caml_stat_wcsdup(stdlib_default); | ||
| } | ||
|
|
||
| CAMLassert(basename != root && Is_separator(*(basename - 1))); |
There was a problem hiding this comment.
| CAMLassert(basename && basename != root && Is_separator(*(basename - 1))); |
for safety since https://learn.microsoft.com/en-us/windows/win32/api/fileapi/nf-fileapi-getfullpathnamea says:
If lpBuffer refers to a directory and not a file, lpFilePart receives zero.
edit: Actually, what would be better is to return stdlib_default if basename==0, since basename is used later.
There was a problem hiding this comment.
Hang on, is it worth considering the domain of the values possible here - exe_name will be the result of GetModuleFileName so it cannot possibly be a directory nor - given Windows semantics - can it have been replaced by a directory in a race (because the running program can't be deleted).
So I agree it's worth the assertion checking that basename isn't NULL, but surely it's not worth a code path to handle that case, because basename being NULL can only happen through a coding error (i.e. anything outside an assertion checking this should be an unreachable code path?)
|
Looks good for me after my last two comments. |
|
@dra27 reminded me today to have a look at the middle-end part of the patch (around the |
23733fc to
ce0009a
Compare
|
Rebased - review responses to follow |
ce0009a to
8bbdc56
Compare
|
I have hopefully addressed all the review comments, thank you - both the "Compare" should be working above to see the changes without the noise of the rebase and I have put each change in separate "Review" commits for squashing later. |
|
@lthls - ah, I see what you mean (I think); so instead of compiling it away as we go into {f,c}lambda, instead introduce a new middle-end primitive (which should get rid of the hacking around with |
8bbdc56 to
389fde8
Compare
Yes. I did give it a try last week and it turned out to be less elegant (and more work) than I expected, so I'm now inclined to let the current version stay. |
389fde8 to
62f4517
Compare
|
precheck#1082 seems happy. This one looks ready to go, next - thanks, everyone! |
9d566ec to
eb401f4
Compare
|
Let's try that again - an enhanced battery of tests is running in precheck#1084 as well. |
Both Cygwin and MSYS2 are now consistently detected on MSYS2. In particular, this means that ./configure --prefix $PWD/install and similar will cause the prefix to be correctly translated to a Windows path, as already happens on Cygwin.
Previously, the --prefix argument was always normalised with cygpath -m which meant that regardless of the argument, the paths used in the compiler would always use slashes. This behaviour is preserved if a slash is detected in the argument, i.e. the caller explicitly uses mixed notation (e.g. `--prefix=C:/Prefix` or `--prefix $PWD/install`). In particular, it means that a Cygwin-style path will be correctly converted to a Windows-style path. If the path uses backslashes, then it is still converted to use forward slashes for the installation commands, but the backslashes are otherwise preserved and used within the build itself.
The runtime-launch-info file includes the location of the binary directory. The compiler is extended so that . refers to the directory of the compiler binary.
By default, ocamlrun first tries to resolve argv[0] to determine where the bytecode image is and then tries opening the executable image itself. This is obviously correct for ocamlrun, when being called using a shebang or executable header, but it's not correct for -custom executables where we _know_ that the bytecode image should be with the executable. To achieve this, a new mode is added to caml_byte_program_mode (and the existing ones renamed) such that caml_byte_program_mode is now STANDARD (for ocamlrun - the existing behaviour), APPENDED (for -custom executables - the new behaviour) and EMBEDDED (for -output-complete-exe/-output-obj - the original use of it). The mode is also set directly by the linker, rather than having a default in libcamlrun which is then overridden by the startup code for -output-complete-exe. In the new APPENDED mode, if caml_executable_name is implemented (i.e. it returns a string) then this file _must_ contain the bytecode image and no other mechanisms are used. On platforms where caml_executable_name is not implemented, APPENDED falls back to STANDARD for compatibility. Technically, this stops an argv[0] injection attack on setuid/setgid -custom bytecode executables, although setuid should be used with -output-complete-exe, if at all.
Previously, the bytecode runtime just used OCAML_STDLIB_DIR from build_config.h. This value is now stored once in dynlink.o as caml_runtime_standard_library_default.
%standard_library_default allows Config.standard_library_default to be converted to a compile-time derived value, as with existing compile-time constants such as %backend_type, etc. This paves the way for allowing Config.standard_library_default to be changed at link-time, rather than fixed when the Config module itself is compiled.
Allows the default location used by the bytecode runtime for the Standard Library to be overridden when creating bytecode executables.
Config.standard_library_default is now implemented using the %standard_library_default primitive. This allows a convenient test which can be added for `-set-runtime-default`. The change also makes the host-like nature of Config.standard_library_default clearer, as the build of the cross-compiler must now (correctly) specify the location of its (target) Standard Library.
When configured with --with-relative-libdir, the runtime uses the directory of the executable to determine the location of the Standard Library. Thus, ocamlrun and the compilers look for ../lib/ocaml by default. This is implemented by changing caml_standard_library_default to be a relative path, and then computing the actual value at startup (for bytecode) and when queried (for native). Executables (and objects) produced by the compiler always have an absolute value of caml_standard_library_default. ocamlc.opt and ocamlopt.opt are built using -set-runtime-default to force caml_standard_library_default to be a relative value.
mingw-w64 is based on GCC, so supports -fdebug-prefix-map, but the test for it is skipped in configure. The test is no longer skipped (which means that Config.c_has_debug_prefix_map returns true) but the flag is still explicitly not used by the compilers (as before).
Indication as to whether ocamlopt assembles files via the C compiler or by calling the assembler directly.
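As a rough illustration of the --with-relative-libdir commit message above, the effective Standard Library location can be derived from the executable's directory and an explicit-relative default. This is a sketch under stated assumptions, not the runtime's implementation: the function names are hypothetical, only ./ and ../ prefixes are treated as explicit-relative (matching the configure check quoted earlier), Unix-style paths are assumed, and the normalisation step the runtime performs (e.g. via realpath) is omitted.

```ocaml
(* Explicit-relative means starting with ./ or ../ (assumption taken from
   the configure error message in this PR). *)
let is_explicit_relative path =
  String.starts_with ~prefix:"./" path
  || String.starts_with ~prefix:"../" path

(* Combine the directory of the running executable with the default;
   an absolute default is used unchanged. No normalisation is done here. *)
let effective_stdlib ~exe_path ~default =
  if is_explicit_relative default then
    Filename.concat (Filename.dirname exe_path) default
  else
    default
```

For example, with exe_path "/usr/local/bin/ocamlopt" and default "../lib/ocaml", this yields the un-normalised "/usr/local/bin/../lib/ocaml" on a Unix system, which names the same directory as /usr/local/lib/ocaml.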
eb401f4 to
02588db
Compare
|
Rebased to clear the merge conflict - while the double-bootstrap provides a welcome warming for my very cold office, can I cajole/goad/bribe another core dev into merging this before the next merge conflict arises? 🙂 |
|
AppVeyor failed, and I have no idea why -- it fails after installing Cygwin, apparently after doing anything OCaml-related, and there is no error message in the logs. Oh well, let's merge anyway. |
It passed on trunk! 😱 |
This is the second of three PRs which implement Relocatable OCaml as proposed in ocaml/RFCs#53. The series of changes in this PR combine to allow the absolute location of the Standard Library (e.g. /usr/lib/ocaml) to be removed from both the C runtime (ocamlrun and libcamlrun.a, etc.) and also from the Config module in the ocamlcommon compiler-libs library. The patches address sections 3 & 4 of the RFC.
The key changes are:
- %standard_library_default allows an OCaml program to determine the default of the Standard Library as a compile-time constant. In particular, it means that rather than Config.standard_library_default being fixed once when config.ml itself is compiled, it allows the compiler to determine its value each time a program is linked.
- -set-runtime-default allows the calculated value for the %standard_library_default primitive to be overridden when linking an executable.
- configure option, --with-relative-libdir, which allows the compiler to be configured to expect to find the Standard Library in a location specified relative to where the compiler is running from. When ./configure is run with no arguments, the default location of the Standard Library is /usr/local/lib/ocaml and binaries are installed to /usr/local/bin. The equivalent relative compiler would be configured with --with-relative-libdir=../lib/ocaml.
- BUILD_PATH_PREFIX_MAP support added in Honor BUILD_PATH_PREFIX_MAP #1515 and usage of -fdebug-prefix-map-style options to the C compiler to make the compiler's artefacts considerably more reproducible when --with-relative-libdir has been specified.
These changes necessitate considerable churn in the start-up routines for the bytecode runtime, and in the way argv[0] is being processed. While this code is in motion, there are several additional changes which aren't a strict requirement of the main change, but are here because this is all being shaken up:
- configure now correctly recognises that it is a Cygwin-like environment, and uses cygpath et al as necessary.
- --prefix (e.g. ./configure --prefix='C:\OCaml') then these are preserved in the resulting compiler (this implements a slightly more sensible version of Display paths using backslashes on Windows #658). Apart from grinding my own axe where this is concerned, it prevents "mixed" slash paths (which look particularly amateur) from being generated when the compiler is configured --with-relative-libdir.
- -custom, it is possible to direct the resulting executable to load a different bytecode image by manipulating argv[0]. This issue is fixed here, and on normal systems (where caml_executable_name is implemented), an executable compiled with -custom only loads the bytecode image it was compiled with and can no longer be directed to load a different one.
- ocamlopt now calls as directly, just as Linux does, rather than going via gcc.
Technical background
The crux of this PR is this pair of innocuous-looking lines:
https://github.com/dra27/ocaml/blob/0728f6af2aae32a97c2a7a1214c25736a26a479b/runtime/dynlink.c#L91
and
https://github.com/dra27/ocaml/blob/0728f6af2aae32a97c2a7a1214c25736a26a479b/utils/config.generated.ml.in#L23
which, by default, are:
and
Here,
/usr/localis the installation prefix. A consequence of being relocatable, as defined in the RFC, is that this string cannot appear (in any encoding!) in the compiler binaries. At present, the string is present in theConfigmodule and in the bytecode runtime. Its presence in the bytecode runtime means that essentially all bytecode executables contain the location of the OCaml Standard Library, either directly, if compiled with-output-obj,-output-complete-exe, etc., or through theocamlrunexecutable, if compiled with default options or-custom. Conversely, native executables only contain the location of the OCaml Standard Library if they link theConfigmodule from theocamlcommon.cmxa1.For the compiler or runtime, installed to
/usr/local/bin/ocamloptor/usr/local/bin/ocamlrun, it is straightforward forocamlopt/ocamlrunto determine the directory containing the running executable (/usr/local/bin) and to instead contain the location../lib/ocamland thus combine the two to derive/usr/local/lib/ocaml. In the code,../lib/ocamlis the default value, which is static and may be explicit-relative or absolute, where the calculated/usr/local/lib/ocamlis the effective value, which is dynamic and must be absolute. Thus, at module initialisation in:ocaml/utils/config.common.ml.in
Lines 23 to 30 in 0728f6a
we can instead calculate the effective value
"/usr/local/lib/ocaml"from the defaultstandard_library_default = {|../lib/ocaml|}.If only it were that straightforward! The calculation of the effective location requires the caller to agree to be invoked from a binary located in a specific place relative to the effective location. The compiler is not the only consumer of
Config.standard_library. Consider the trivial programdisplay.ml:compiled with:
Today, the two final commands display the same result. If Config.standard_library always uses the effective value as the default for the Standard Library location, ocamlopt -where will continue to display /usr/local/lib/ocaml but ./display will suddenly display /home/dra/work/../lib/ocaml (or something similar, but nonetheless not /usr/local/lib/ocaml).
Finally, bytecode executables pose an additional problem. Let us extend the trivial program slightly:
compiled with:
$ /usr/local/bin/ocamlc -o display -I +unix -I +compiler-libs unix.cma ocamlcommon.cma display.ml
Supposing we have two installations of the same version and configuration of OCaml, one in /usr/local and another in ~/.opam/default. In this case, we have:
Now, executing ./display involves loading dllunixbyt.so from the stublibs directory. In these two invocations:
there are two important things we reasonably expect to happen:
- ./display will display the same path in each case, which will be /usr/local/lib/ocaml (Config.standard_library for the compiler it was built with)
- ocamlrun will load dllunixbyt.so from its own installation - i.e. /usr/local/bin/ocamlrun will load /usr/local/lib/ocaml/stublibs/dllunixbyt.so and ~/.opam/default/bin/ocamlrun will load ~/.opam/default/lib/ocaml/stublibs/dllunixbyt.so
This leads to what I think is not an entirely obvious distinction. There are two locations to consider: the default location of the Standard Library for the runtime, and the default Standard Library location for the mutator. Config.standard_library_default refers to the mutator (i.e. program's view), but this may not necessarily be the same value as the runtime needs for OCAML_STDLIB_DIR in runtime/dynlink.c. In practice, this only affects standalone bytecode images - i.e. the situation where the runtime executable and the bytecode image are in separate locations. In all other compilations (including native code; though the native runtime doesn't ever care about the location of the Standard Library), there are still two values, but they are always the same.
The changeset is best reviewed commit-by-commit (and, I'm afraid, armed with the "Technical background" explanation...):
enum caml_byte_program_modeis augmented with a newAPPENDEDoption which is used for-custom.caml_mainthen usescaml_byte_program_modeto differentiate between a#!-style "standalone" bytecode image running viaocamlrunand a-customexecutable. While moving the code around, I renamed theCOMPLETE_EXEenumeration constant toEMBEDDEDas the mode is also used with-output-obj.runtime-launch-infois one of two places where the binary directory, rather than library directory is embedded in a file. The format forruntime-launch-infois trivially extended to recognise.as referring to the directory containing the compiler (note thatruntime-launch-infois not a generally-configurable file - supporting arbitrary relative paths here would be a hypothetical installation where the runtime executables are installed to a different location from the compiler, which isn't supported or needed at present).caml_runtime_standard_library_defaultfor the runtime default value, principally so thatOCAML_STDLIB_DIRis only referred to in once place.%standard_library_defaultis the most involved change, introducing a compile-time constant to retrieve the mutator default value. In native code, the Standard Library location is only present if theConfigmodule is linked.ocamlopttherefore has to create the string as part of linking an executable, which is done using a similar mechanism to thecaml_apply,caml_curryandcaml_sendfunctions - a new field in the cmx header records that the compilation unit uses the%standard_library_defaultprimitive. At link-time, if any of the compilation units has set this flag, the linker createscaml_standard_library_natcontaining the location of the standard library and synthesises references to it. For bytecode, a similar technique is used, except that the linker already has the full list of%-primitives which are used by the program, so there is no need for cmo format to be changed. There are various strategies on offer for exactly where and how the value is stored. 
It is needed to link the bytecode runtime (i.e. the value has to be included in the same places where C tables of primitives and so forth are generated). It also has to be included in bytecode images where no C is being produced. Given that -output-complete-exe and so forth share the same value for both runtime and mutator, for bytecode images I've opted to add a new OSLD bytecode section containing the mutator value, which is read into caml_standard_library_default (not into caml_runtime_standard_library_default). Then, as with the other compile-time constants, it's just a matter of adding another caml_sys_const_ primitive in runtime/sys.c which returns an OCaml copy of that string.

I've proposed this as a %-primitive, rather than a "known" C primitive, for two reasons. Firstly, the behaviour of this is a compile-time constant (i.e. something which the compiler must work out when linking the program), and the other compile-time constants are %-primitives. Secondly, the Standard Library location only wants to be embedded when it's actually needed; if the native runtime used a C primitive, there would have to be a sentinel value or a default for caml_standard_library_nat - it seems worse to have a C primitive sat in the runtime which could return an invalid value (i.e. caml_sys_const_standard_library_default always returns a correct value in bytecode, but if exposed in native code, it would be possible to end up with a program which called it, but got a default empty string back).
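To illustrate the idea of carrying the mutator value in an OSLD section, here is a simplified sketch of locating a named section in an in-memory bytecode image. It assumes the usual trailer layout (section data, then 8-byte descriptors made of a 4-character name plus a big-endian length, then a 4-byte section count and a 12-byte magic number). The names find_section and read_be32 are hypothetical, the magic number is not checked, and the actual runtime reads from a file descriptor rather than from a buffer, so this is an illustration rather than the code in the PR.

```c
#include <stddef.h>
#include <stdint.h>
#include <string.h>

/* Assumed layout of the image for this sketch:
   [section data ...][descriptors: 4-byte name + 4-byte big-endian length]
   [4-byte big-endian section count][12-byte magic number] */
#define TRAILER_SIZE 16 /* section count + magic number */
#define DESCR_SIZE 8    /* section name + section length */

/* Decode a 4-byte big-endian unsigned integer. */
static uint32_t read_be32(const unsigned char *p)
{
  return ((uint32_t)p[0] << 24) | ((uint32_t)p[1] << 16)
       | ((uint32_t)p[2] << 8) | (uint32_t)p[3];
}

/* Hypothetical helper: returns a pointer to the data of the section called
   [name] (exactly 4 characters, e.g. "OSLD") and stores its length in *len.
   Returns NULL if the section is absent or the image is malformed. */
static const unsigned char *find_section(const unsigned char *image,
                                         size_t size, const char *name,
                                         uint32_t *len)
{
  if (size < TRAILER_SIZE) return NULL;
  const unsigned char *trailer = image + size - TRAILER_SIZE;
  uint32_t count = read_be32(trailer);
  if (count > (size - TRAILER_SIZE) / DESCR_SIZE) return NULL;
  const unsigned char *descr = trailer - (size_t)count * DESCR_SIZE;
  /* The data of the last section ends where the descriptors begin, so walk
     the descriptors backwards, peeling off one section's data each time. */
  const unsigned char *data_end = descr;
  for (uint32_t i = count; i-- > 0; ) {
    const unsigned char *d = descr + (size_t)i * DESCR_SIZE;
    uint32_t l = read_be32(d + 4);
    if (l > (size_t)(data_end - image)) return NULL;
    const unsigned char *data = data_end - l;
    if (memcmp(d, name, 4) == 0) { *len = l; return data; }
    data_end = data;
  }
  return NULL;
}
```

With a helper of this shape, a loader could look for "OSLD" and, if it is present, copy its contents into the mutator default, leaving the runtime default untouched.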
Given that these primitives are never intended to be called by user-code (because of the difference in linking), unlike in "Always define caml_startup et al in bytecode, as already done in native code" (#13465), I think it's better to have the C primitive for bytecode only, and to completely synthesise the function in ocamlopt only when needed.

%standard_library_default provides a mechanism which means that the string constant carved into utils/config.ml when the compiler was built can now be determined when the compiler is run. The default value determined by the compiler is simply that same absolute path. The -set-runtime-default option provides a mechanism to change that default value when linking a specific program.

Config.standard_library_default to be changed to be the result of %standard_library_default. This mechanism is a necessary consequence of %standard_library_default for cross-compilation. When linking a cross-compiling version of ocamlopt, that compiler is built with a compiler which uses a host standard library but the resulting cross-compiler it's linking should default to a different target standard library (see the change in Makefile.cross).

%standard_library_default means that "/usr/local/lib/ocaml" has now moved from the Config module to the executables themselves. Where before, ocamlopt and ocamlc.byte both had the string location linked in code via Config.standard_library, it's now instead embedded in caml_standard_library_nat for ocamlopt (and ocamlc.opt) and in an OSLD section for ocamlc.byte. The next commit then allows explicit-relative values of this path to be interpreted by both the runtime and compiler. This is then activated using -set-runtime-default, so that caml_standard_library_nat is changed to be "../lib/ocaml". Most of the complexity here arises from the fact that the bytecode C runtime needs to be able to do this.
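A minimal sketch of how an explicit-relative default such as "../lib/ocaml" might be interpreted relative to the directory containing the executable. The helper names is_explicit_relative and effective_stdlib_dir are hypothetical and are not taken from the PR. The rule that "explicit-relative" means a leading "." or ".." component is an assumption of this sketch, and it only handles "/" as a separator.

```c
#include <stdio.h>
#include <string.h>

/* Assumption for this sketch: a path is explicit-relative when its first
   component is "." or "..", e.g. "./lib" or "../lib/ocaml". */
static int is_explicit_relative(const char *path)
{
  if (path[0] != '.') return 0;
  if (path[1] == '\0' || path[1] == '/') return 1;
  return path[1] == '.' && (path[2] == '\0' || path[2] == '/');
}

/* Hypothetical helper: computes the effective Standard Library directory
   from the directory containing the executable and the linked-in default.
   Explicit-relative defaults are joined onto exe_dir; anything else is
   used unchanged. Returns 0 on success, -1 if [out] is too small. */
static int effective_stdlib_dir(const char *exe_dir, const char *dflt,
                                char *out, size_t out_size)
{
  int n;
  if (is_explicit_relative(dflt))
    n = snprintf(out, out_size, "%s/%s", exe_dir, dflt);
  else
    n = snprintf(out, out_size, "%s", dflt);
  return (n >= 0 && (size_t)n < out_size) ? 0 : -1;
}
```

The sketch leaves the ".." component in the joined path. A real implementation would probably canonicalise the result (for instance with realpath on Unix), and would first have to determine exe_dir, which is the platform-specific part.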
Especially in the light of the work on ld.conf in "Relocatable OCaml - explicit-relative paths in ld.conf" (#14243), rather than implementing the logic in both C and OCaml, the logic is implemented just in C and the Config module uses a new caml_sys_get_stdlib_dirs primitive, which is passed the result of %standard_library_default and returns both the effective value and also the directory containing the executable (which is used to implement Config.bindir for ocamlmklib). Note, in passing, that ocamlmklib on Windows now searches for tools in the same way as on Unix, as there's no need for the PATH-search previously there.

The C implementation itself is in caml_locate_standard_library which is implemented separately for Unix and Windows in runtime/unix.c and runtime/win32.c. The tools used to implement this (dirname, realpath, etc.) are not equivalently available on Windows, by which I mean that the functions don't have exactly equivalent semantics. In particular, an exact dirname is not available on Windows, but GetFullPathName has the required semantics for this specific use and actually has an option to return the dirname (albeit indirectly). On this occasion, it therefore seemed easier to make the entire "work out the location of the Standard Library" operation platform-specific, than to go to more effort to construct platform-specific building blocks for a generic version of this function.

ocamlc.byte -where produces the expected result regardless of whether the original or copied ocamlrun is used. If the "CI: Full matrix" label of "GHA: add an optional wider test matrix (Cygwin, static, minimal, etc.)" (#14013) is added, then each of the jobs builds an additional compiler with the alternate configuration (i.e. the jobs which by default are --without-relative-libdir then have an additional --with-relative-libdir build created, and vice versa).
These two compilers are likewise used for the "cross-runtime" ocamlc.byte -where test.

BUILD_PATH_PREFIX_MAP and directly adding -ffile-prefix-map to our internal C flags. Ignoring #! lines and RNTM sections (which are addressed in the final PR), when building with --with-relative-libdir, none of the binaries in the build contain either the build path or the installation prefix on Windows (with mingw-w64) or Linux.

Footnotes
Note that prior to "Emancipate dynlink from compilerlibs" (#11996) in 5.3.0, native code executables which linked dynlink.cmxa also contained the location of the OCaml Standard Library through the copy of the Config module in the Dynlink_compilerlibs library.