[5.1] Avoid a systematic dependency on the ZSTD compression library#12705
xavierleroy wants to merge 4 commits into ocaml:5.1
This is achieved by moving the runtime functions that use ZSTD out of extern.c and intern.c and into a new zstd.c file, which connects itself to extern.c and intern.c via hooks activated when the Marshal module initializes.
Already done in trunk.
Currently, both the bytecode version and the native-code version of Dynlink drag in many modules from the compiler, including some that use compressed marshaling. This causes users of dynlink.cmxa to depend on -lzstd systematically. In fact, the native-code version of Dynlink needs far fewer compiler modules, none of which require compressed marshaling. This commit just shrinks the list of compiler modules included in dynlink.cmxa, thus cutting the dependency on -lzstd.
765a7c7 to
d49bd97
Maybe I messed up something with the way I’m testing this, but this statement doesn’t seem to hold true (at least on macOS): EDIT: The same thing happens on Linux too
I tested this branch just now under Linux, seems to work fine for me:
Hmm - I'm seeing no issue on amd64/arm64 Linux, either - I can't test right now on Apple Silicon, but can in the next couple of days...
I've tried without opam just in case there was some hidden bug here but I'm still getting the same output: I'm testing it on an ArchLinuxARM derivative. Maybe something is different there. The default linker is
I'm seeing the effect on Arch (on amd64) with both
It sounds like the issue is that Archlinux, contrary to Debian/Ubuntu/Fedora/Gentoo, doesn't use
Just for the sake of completeness, since I just finished running all the tests on all those platforms: I can also reproduce on macOS/x86_64 (running macOS 10.15) as well as FreeBSD-13.1/arm64: NetBSD-current/arm64: and I suspect on OpenBSD-7.3/x86_64 but not Windows 10/Cygwin/x86_64 for example:
That sounds like the needed fix to me, but not just on Linux.
We can also check at configure time whether the linker supports
`--as-needed` and enable the option only if it's supported. Even better
would be to be able to figure out whether it's part of the defaults or
not. Well, it may be possible to craft a test to determine whether
unneeded libs are linked, i.e. if the option is necessary because there
actually _is_ a problem to solve.
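One way to probe "are unneeded libs linked?" is a small C helper that lists the shared objects loaded into the running process. This is only a sketch, assuming a platform that provides `dl_iterate_phdr` in `<link.h>` (glibc and some BSDs do; macOS does not). The names `demo_query`, `demo_visit` and `demo_count_loaded` are hypothetical and are not part of the OCaml configure script.

```c
#define _GNU_SOURCE           /* needed for dl_iterate_phdr on glibc */
#include <assert.h>
#include <link.h>
#include <stddef.h>
#include <string.h>

/* Hypothetical helper state: substring to look for, and number of matches. */
struct demo_query { const char *needle; int hits; };

/* Callback invoked once per object loaded in the current process. */
static int demo_visit(struct dl_phdr_info *info, size_t size, void *data)
{
  struct demo_query *q = data;
  (void) size;
  if (info->dlpi_name != NULL && strstr(info->dlpi_name, q->needle) != NULL)
    q->hits++;
  return 0;                   /* returning 0 keeps the iteration going */
}

/* Count loaded shared objects whose path contains `needle`. */
int demo_count_loaded(const char *needle)
{
  struct demo_query q = { needle, 0 };
  assert(needle != NULL);
  dl_iterate_phdr(demo_visit, &q);
  return q.hits;
}
```

A test program could call `demo_count_loaded("libzstd")` from `main` after being linked with `-lzstd` but without referencing any ZSTD symbol. A non-zero count would suggest that the linker kept the unneeded library, i.e. that "as needed" behavior is not the default on that system.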
This code is cursed. Yes, this PR requires "as needed" behavior from the C linker. It's supported as an The show-stopper is macOS, where So, I'm closing this PR. Maybe the other task force members can select an alternate approach and shape it as a PR against 5.1.
This suggestion may be a bit late, but still. I don't remember all the experiments, but wouldn't it perhaps be better not to compress at the level of marshal? That is, simply have the compiler compress and decompress the results of marshal (and perhaps just add a flag to marshal that makes it compression friendly). Was it considered too slow? The idea would be to only have the compiler/compiler-libs (and possibly dynlink) depend on zstd, not the whole runtime system.
3 or 4 of the 7 workarounds remove compressed marshaling from the stdlib and make compression internal to the compiler. But bootstrapping pretty much requires that compression is part of ocamlrun, and some of the 3-4 workarounds also rely on "as-needed linking" of -lzstd. So it's no silver bullet.
@xavierleroy, a quick search pointed me to (On the linux/BSD side, I don't think it is problematic to be less vanilla than archlinux on the default linker option.)
The versions with an external library rely on the trick where However, I think this PR is recoverable, using the workaround mentioned above for |
I cannot reproduce this behaviour (ie Tested on:
My test was done with slightly more recent software: Could be a recent breakage in macOS ld. Someone who likes software archeology could look into the linker sources. At any rate, it's always delicate to rely on undocumented linker flags.
Just for the record, which version of Unfortunately, the most recent version of
On my Mac, I have Maybe you need to look into the
Maybe related: https://developer.apple.com/documentation/xcode-release-notes/xcode-15-release-notes#Linking
@xavierleroy: could you try with
Also spotted: the Go developers are passing
I confirm that adding
So, this could be a workaround, but only until "the classic linker [is] removed in a future release"...
#12006 introduced optional support for compressing marshaled data using the ZSTD compression library.
After this code was released in OCaml 5.1, #12562 pointed out an unfortunate side effect of the implementation in #12006: if ZSTD is autodetected as present and not disabled by `configure --without-zstd`, every native executable produced by `ocamlopt` or `ocamlc -custom` is linked with ZSTD, usually dynamically, which causes possibly-unwanted dependencies, or statically (if extra flags are passed), which increases the sizes of executables quite a bit.
The reason for this (unexpected) dependency is that all OCaml programs link with the Stdlib standard library module, which defines a `Stdlib.output_value` function that drags in the runtime code from runtime/extern.c, which contains references to ZSTD (if configured in).
A special task force composed of @dra27 and @xavierleroy and advised by @Octachron and @nojb worked frantically on this issue, producing no fewer than 7 workarounds. This PR is the workaround that the four of us propose for inclusion in 5.1.1.
With this PR, ocamlopt-generated programs do not depend on the ZSTD dynamic library, unless they use the `Marshal` standard library module that exposes the 5.1 API for compressed marshaling. (Merely using `Stdlib.output_value` or `Stdlib.input_value` does not create a dependency.)
The bytecode interpreter `ocamlrun`, as well as executables produced by `ocamlc -custom`, still dynamically link with ZSTD. (A workaround is being considered for `ocamlc -custom` but not for inclusion in 5.1.1.)
The dependency on ZSTD is relaxed by moving ZSTD-specific code to a new runtime system file, runtime/zstd.c, weakly connected with extern.c and intern.c via hooks. The zstd.c file is dragged in and initialized by the Marshal module.
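The hook mechanism can be illustrated with a minimal C sketch. Everything here is hypothetical: the names `demo_compress_hook`, `demo_marshal`, `demo_rle` and `demo_compression_init` are invented for illustration, and a trivial run-length encoder stands in for ZSTD. It is not the code of this PR. The point is only that the generic marshaling code calls through a function pointer that stays NULL until an optional component registers itself.

```c
#include <assert.h>
#include <stddef.h>
#include <string.h>

/* --- Generic side (plays the role of extern.c); hypothetical names. --- */

/* Hook left NULL until a compression backend registers itself. */
typedef size_t (*demo_hook)(const char *src, size_t len, char *dst, size_t cap);
static demo_hook demo_compress_hook = NULL;

/* Writes `src` to `dst`, compressed only if requested AND a backend is
   registered. Returns the number of bytes written, or 0 if `dst` is too small. */
size_t demo_marshal(const char *src, size_t len, char *dst, size_t cap,
                    int want_compression)
{
  assert(src != NULL && dst != NULL);
  if (want_compression && demo_compress_hook != NULL)
    return demo_compress_hook(src, len, dst, cap);
  if (len > cap) return 0;
  memcpy(dst, src, len);      /* no backend: fall back to a plain copy */
  return len;
}

/* --- Optional side (plays the role of zstd.c); hypothetical names. --- */

/* Stand-in compressor: run-length encoding as (count, byte) pairs. */
static size_t demo_rle(const char *src, size_t len, char *dst, size_t cap)
{
  size_t out = 0;
  for (size_t i = 0; i < len; ) {
    size_t run = 1;
    while (i + run < len && src[i + run] == src[i] && run < 255) run++;
    if (out + 2 > cap) return 0;
    dst[out++] = (char) run;
    dst[out++] = src[i];
    i += run;
  }
  return out;
}

/* Called by whichever module needs compression; installs the hook. */
void demo_compression_init(void)
{
  demo_compress_hook = demo_rle;
}
```

In this sketch both sides live in one file for brevity. In a layout like the one described above, they would be separate object files, so that a program which never references the initialization function gives the linker no reason to pull in the compression code or the external library behind it.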
As a bonus, this PR also shrinks the list of compiler modules that the native-code version of Dynlink needs, excluding compiler modules that use the Marshal module for compressed marshaling. Hence, linking with dynlink.cmxa does not add a dependency on ZSTD.
This PR is against 5.1, so as to be integrated in 5.1.1. If accepted, it will be forward-ported to trunk.