[5.1] Avoid a systematic dependency on the ZSTD compression library #12705

Closed
xavierleroy wants to merge 4 commits into ocaml:5.1 from xavierleroy:zstd-dependency-5.1

@xavierleroy
Contributor

#12006 introduced optional support for compressing marshaled data using the ZSTD compression library.

After this code was released in OCaml 5.1, #12562 pointed out an unfortunate side effect of the implementation in #12006: if ZSTD is autodetected as present and not disabled by configure --without-zstd, every native executable produced by ocamlopt or ocamlc -custom is linked with ZSTD, usually dynamically, which causes possibly-unwanted dependencies, or statically (if extra flags are passed), which increases the sizes of executables quite a bit.

The reason for this (unexpected) dependency is that all OCaml programs link with the Stdlib standard library module, which defines a Stdlib.output_value function that drags in the runtime code from runtime/extern.c, which contains references to ZSTD (if configured in).

A special task force composed of @dra27 and @xavierleroy and advised by @Octachron and @nojb worked frantically on this issue, producing no fewer than 7 workarounds. This PR is the workaround that the four of us propose for inclusion in 5.1.1.

With this PR, ocamlopt-generated programs do not depend on the ZSTD dynamic library, unless they use the Marshal standard library module that exposes the 5.1 API for compressed marshaling. (Merely using Stdlib.output_value or Stdlib.input_value does not create a dependency.)

The bytecode interpreter ocamlrun, as well as executables produced by ocamlc -custom, still dynamically link with ZSTD. (A workaround is being considered for ocamlc -custom but not for inclusion in 5.1.1.)

The dependency on ZSTD is relaxed by moving ZSTD-specific code to a new runtime system file, runtime/zstd.c, weakly connected with extern.c and intern.c via hooks. The zstd.c file is dragged in and initialized by the Marshal module.
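
The hook mechanism described above can be sketched roughly as follows. This is a minimal, single-file illustration and not the actual runtime code: the names `compress_hook`, `marshal_output`, `compression_init` and `toy_compress` are hypothetical, and a toy run-length encoder stands in for ZSTD so that the example has no external dependency. In the real design the two halves would presumably live in separate object files, so that the linker only pulls in the compression side when something calls its initializer.

```c
#include <stddef.h>
#include <string.h>

/* ---- "extern.c" side: no reference to any compression library ---- */

/* Hook signature (hypothetical): compress len bytes from src into dst,
   returning the compressed size, or 0 on failure. */
typedef size_t (*compress_hook_fn)(const unsigned char *src, size_t len,
                                   unsigned char *dst, size_t cap);

/* NULL until the compression side installs itself. */
static compress_hook_fn compress_hook = NULL;

/* Write src to dst. Returns 1 if the data was compressed through the hook,
   0 if it was stored raw, and -1 if dst is too small for the raw data. */
int marshal_output(const unsigned char *src, size_t len,
                   unsigned char *dst, size_t cap, size_t *out_len)
{
  if (compress_hook != NULL) {
    size_t n = compress_hook(src, len, dst, cap);
    if (n > 0 && n < len) {   /* keep the result only if it is smaller */
      *out_len = n;
      return 1;
    }
  }
  if (len > cap) {
    *out_len = 0;
    return -1;
  }
  memcpy(dst, src, len);      /* fallback: uncompressed output */
  *out_len = len;
  return 0;
}

/* ---- "zstd.c" side: the only code that knows about compression ---- */

/* Toy run-length encoder standing in for ZSTD: emits (count, byte) pairs. */
static size_t toy_compress(const unsigned char *src, size_t len,
                           unsigned char *dst, size_t cap)
{
  size_t in = 0, out = 0;
  while (in < len) {
    unsigned char b = src[in];
    size_t run = 1;
    while (in + run < len && src[in + run] == b && run < 255) run++;
    if (out + 2 > cap) return 0;   /* not enough room: report failure */
    dst[out++] = (unsigned char) run;
    dst[out++] = b;
    in += run;
  }
  return out;
}

/* Called by whoever wants compression (the Marshal module, in the PR's
   design). Installing the hook is the only link between the two sides. */
void compression_init(void)
{
  compress_hook = toy_compress;
}
```

Before `compression_init()` is called, `marshal_output` stores data uncompressed; afterwards it routes through the installed hook. A program that never calls the initializer never references the compression code.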

As a bonus, this PR also shrinks the list of compiler modules that the native-code version of Dynlink needs, excluding compiler modules that use the Marshal module for compressed marshaling. Hence, linking with dynlink.cmxa does not add a dependency on ZSTD.

This PR is against 5.1, so as to be integrated in 5.1.1. If accepted, it will be forward-ported to trunk.

@xavierleroy xavierleroy added this to the 5.1.1 milestone Oct 30, 2023
This is achieved by moving the runtime functions that use ZSTD
out of extern.c and intern.c and into a new zstd.c file, which connects
itself to extern.c and intern.c via hooks activated when the Marshal
module initializes.
Currently, both the bytecode version and the native-code version of
Dynlink drag in many modules from the compiler, including some that
use compressed marshaling.  This causes users of dynlink.cmxa to
depend on -lzstd systematically.

Actually, the native-code version of Dynlink needs far fewer compiler
modules, none of which require compressed marshaling.

This commit just shrinks the list of compiler modules included in
dynlink.cmxa, thus cutting the dependency on -lzstd.
@kit-ty-kate
Member

kit-ty-kate commented Oct 30, 2023

With this PR, ocamlopt-generated programs do not depend on the ZSTD dynamic library, unless they use the Marshal standard library module that exposes the 5.1 API for compressed marshaling.

Maybe I messed up something with the way I’m testing this but this statement doesn’t seem to hold true (at least on macOS):

$ opam pin
ocaml-variants.5.1.1      git    git+https://github.com/xavierleroy/ocaml#zstd-dependency-5.1
$ cd /tmp
$ touch test.ml
$ ocamlopt test.ml
$ otool -L ./a.out
./a.out:
	/opt/homebrew/opt/zstd/lib/libzstd.1.dylib (compatibility version 1.0.0, current version 1.5.5)
	/usr/lib/libSystem.B.dylib (compatibility version 1.0.0, current version 1336.0.0)
$ ocaml --version
The OCaml toplevel, version 5.1.1+dev0-2023-09-14
$ opam switch
#  switch   compiler              description
→  default  ocaml-variants.5.1.1  default

EDIT: The same thing happens on Linux too

@nojb
Contributor

nojb commented Oct 30, 2023

EDIT: The same thing happens on Linux too

I tested this branch just now under Linux, seems to work fine for me:

nojebar@PERVERSESHEAF:~/ocaml$ local/bin/ocamlopt -config | grep compression
compression_supported: true
nojebar@PERVERSESHEAF:~/ocaml$ true > test.ml
nojebar@PERVERSESHEAF:~/ocaml$ local/bin/ocamlopt test.ml
nojebar@PERVERSESHEAF:~/ocaml$ ldd a.out
        linux-vdso.so.1 (0x00007ffce1ff1000)
        libm.so.6 => /lib/x86_64-linux-gnu/libm.so.6 (0x00007f9004bee000)
        libc.so.6 => /lib/x86_64-linux-gnu/libc.so.6 (0x00007f90049e1000)
        /lib64/ld-linux-x86-64.so.2 (0x00007f9004d4d000)

@dra27
Member

dra27 commented Oct 30, 2023

Hmm - I'm seeing no issue on amd64/arm64 Linux, either - I can't test right now on Apple Silicon, but can in the next couple of days...

@kit-ty-kate
Member

I tested this branch just now under Linux, seems to work fine for me:

I've tried without opam just in case there was some hidden bug here but I'm still getting the same output:

$ git clone https://github.com/xavierleroy/ocaml -b zstd-dependency-5.1
$ cd ocaml
$ ./configure --prefix "$(pwd)/local"
$ make -j9
$ make install
$ true > test.ml
$ ./local/bin/ocamlopt ./test.ml
$ ldd ./a.out
	linux-vdso.so.1 (0x0000ffff65628000)
	libzstd.so.1 => /usr/lib/libzstd.so.1 (0x0000ffff65470000)
	libm.so.6 => /usr/lib/libm.so.6 (0x0000ffff653c0000)
	libc.so.6 => /usr/lib/libc.so.6 (0x0000ffff65200000)
	/lib/ld-linux-aarch64.so.1 => /usr/lib/ld-linux-aarch64.so.1 (0x0000ffff655f4000)

I'm testing it on an ArchLinuxARM derivative. Maybe something is different there

$ uname -rm
6.5.0-asahi-15-1-edge-ARCH aarch64

The default linker is gold (from GNU binutils) and is compiled this way if that matters: https://github.com/archlinuxarm/PKGBUILDs/blob/c133861ee556286b829c23bdf06fe5da21cc3829/core/binutils/PKGBUILD

@dra27
Member

dra27 commented Oct 30, 2023

I'm seeing the effect on Arch (on amd64) with both -fuse-ld=gold and -fuse-ld=bfd... -lzstd appears to link the zstd DLL regardless of whether it's being used or not. (I also compiled with ocamlopt -S -dstartup, which allows the commands to be repeated by hand: manually omitting -lzstd still links, of course, and naturally results in no reference to the zstd DLL.)

@Octachron
Member

Octachron commented Oct 30, 2023

It sounds like the issue is that Arch Linux, contrary to Debian/Ubuntu/Fedora/Gentoo, doesn't use --as-needed by default. Should we add -Wl,--as-needed to the default LDFLAGS on Linux? (Adding the flag manually fixes the issue for me.)

@kit-ty-kate
Member

kit-ty-kate commented Oct 30, 2023

Just for the sake of completeness, since I just finished running all the tests on all those platforms:

I can also reproduce on macOS/x86_64 (running macOS 10.15)

$ ./local/bin/ocamlopt ./test.ml 
$ otool -L ./a.out 
./a.out:
	/opt/local/lib/libzstd.1.dylib (compatibility version 1.0.0, current version 1.4.8)
	/usr/lib/libSystem.B.dylib (compatibility version 1.0.0, current version 1281.100.1)

as well as FreeBSD-13.1/arm64:

$ ./local/bin/ocamlopt ./test.ml
$ ldd ./a.out
./a.out:
	libzstd.so.1 => /usr/local/lib/libzstd.so.1 (0x404cd000)
	libm.so.5 => /lib/libm.so.5 (0x405a9000)
	libthr.so.3 => /lib/libthr.so.3 (0x40617000)
	libc.so.7 => /lib/libc.so.7 (0x40673000)

NetBSD-current/arm64:

$ ./local/bin/ocamlopt ./test.ml
$ ldd ./a.out
./a.out:
        -lzstd.1 => /usr/pkg/lib/libzstd.so.1
        -lpthread.1 => /usr/lib/libpthread.so.1
        -lc.12 => /usr/lib/libc.so.12
        -lm.0 => /usr/lib/libm.so.0
        -lgcc_s.1 => /lib/libgcc_s.so.1

and I suspect on OpenBSD-7.3/x86_64 but it ran out of disk in the middle of the test so I'll try again later

$ ./local/bin/ocamlopt ./test.ml
$ ldd ./a.out
./a.out:
	Start            End              Type  Open Ref GrpRef Name
	00000518af3b8000 00000518af430000 exe   2    0   0      ./a.out
	0000051ae20cf000 0000051ae21c5000 rlib  0    1   0      /usr/local/lib/libzstd.so.6.3
	0000051b5adb9000 0000051b5adea000 rlib  0    1   0      /usr/lib/libm.so.10.1
	0000051b80a07000 0000051b80a13000 rlib  0    2   0      /usr/lib/libpthread.so.27.0
	0000051ac36d0000 0000051ac37c6000 rlib  0    1   0      /usr/lib/libc.so.97.0
	0000051ab3812000 0000051ab3812000 ld.so 0    1   0      /usr/libexec/ld.so

but not Windows 10/Cygwin/x86_64 for example:

$ ldd ./local/bin/ocamlopt
        ntdll.dll => /cygdrive/c/Windows/SYSTEM32/ntdll.dll (0x7ff98cc90000)
        KERNEL32.DLL => /cygdrive/c/Windows/System32/KERNEL32.DLL (0x7ff98b880000)
        KERNELBASE.dll => /cygdrive/c/Windows/System32/KERNELBASE.dll (0x7ff98a520000)
        cygwin1.dll => /usr/bin/cygwin1.dll (0x7ff949ca0000)
        cyggcc_s-seh-1.dll => /usr/bin/cyggcc_s-seh-1.dll (0x3ff440000)
        cygzstd-1.dll => /usr/bin/cygzstd-1.dll (0x3fda60000)
$ ./local/bin/ocamlopt ./test.ml
$ ldd ./camlprog.exe
        ntdll.dll => /cygdrive/c/Windows/SYSTEM32/ntdll.dll (0x7ff98cc90000)
        KERNEL32.DLL => /cygdrive/c/Windows/System32/KERNEL32.DLL (0x7ff98b880000)
        KERNELBASE.dll => /cygdrive/c/Windows/System32/KERNELBASE.dll (0x7ff98a520000)
        cygwin1.dll => /usr/bin/cygwin1.dll (0x7ff949ca0000)
        cyggcc_s-seh-1.dll => /usr/bin/cyggcc_s-seh-1.dll (0x3ff440000)

Should we add -Wl,--as-needed to the default LDFLAGS on Linux? (Adding the flag manually fixes the issue for me.)

That sounds like the needed fix to me, but not just on Linux.
According to https://www.unix.com/man-page/posix/1/ld/ --as-needed is POSIX-compliant so it should be safe to use on all POSIX-like systems.

@shindere
Contributor

shindere commented Oct 30, 2023 via email

@xavierleroy
Contributor Author

This code is cursed. Yes, this PR requires "as needed" behavior from the C linker. It's supported as an --as-needed option by the GNU linker and the LLVM linker, and the default on some Linux distributions but not all (?).

The show-stopper is macOS, where man ld implies that "as needed" is the default behavior (and lists ways to achieve "no as needed" linking), yet experiments show "no as needed" behavior, and no known options to get "as needed" behavior.

So, I'm closing this PR. Maybe the other task force members can select an alternate approach and shape it as a PR against 5.1.

@dbuenzli
Contributor

This suggestion may be a bit late, but still.

I don't remember all the experiments, but wouldn't it perhaps be better not to compress at the level of marshal? That is, simply have the compiler compress and decompress the results of marshal (and perhaps just add a flag to marshal that makes it compression-friendly). Was it considered too slow?

The idea would be to only have the compiler/compiler-libs (and possibly dynlink) depend on zstd, not the whole runtime system.

@xavierleroy
Contributor Author

3 or 4 of the 7 workarounds remove compressed marshaling from the stdlib and make compression internal to the compiler. But bootstrapping pretty much requires that compression is part of ocamlrun, and some of the 3-4 workarounds also rely on "as-needed linking" of -lzstd. So it's no silver bullet.

@Octachron
Member

@xavierleroy, a quick search pointed me to -dead_strip_dylibs as the ld64 equivalent of --as-needed (that may be used by ghc)? Does that option fail too?

(On the Linux/BSD side, I don't think it is problematic to be less vanilla than Arch Linux on the default linker option.)

@xavierleroy
Contributor Author

-dead_strip_dylibs seems to make no difference when building executables:

~/tmp$ cc -Wl,-dead_strip_dylibs hello.c -L/usr/local/opt/zstd/lib -lzstd
~/tmp$ otool -L ./a.out
./a.out:
	/usr/local/opt/zstd/lib/libzstd.1.dylib (compatibility version 1.0.0, current version 1.5.5)
	/usr/lib/libSystem.B.dylib (compatibility version 1.0.0, current version 1336.0.0)

@dra27
Member

dra27 commented Oct 31, 2023

The versions with an external library rely on the trick where zstd.o is only linked from libasmrun.a if compression is used, but I think only this version relies on the “as-needed” behaviour. They do indeed have their own warts (either the API is changed, or enabling compression support becomes a little more opaque).

However, I think this PR is recoverable, using the workaround mentioned above for ocamlc -custom.

@nojb
Contributor

nojb commented Nov 6, 2023

-dead_strip_dylibs seems to make no difference when building executables:

~/tmp$ cc -Wl,-dead_strip_dylibs hello.c -L/usr/local/opt/zstd/lib -lzstd
~/tmp$ otool -L ./a.out
./a.out:
	/usr/local/opt/zstd/lib/libzstd.1.dylib (compatibility version 1.0.0, current version 1.5.5)
	/usr/lib/libSystem.B.dylib (compatibility version 1.0.0, current version 1336.0.0)

I cannot reproduce this behaviour (i.e. -dead_strip_dylibs seems to work as advertised):

$ echo 'int main() { return 0; }' > hello.c
$ cc hello.c -L/opt/homebrew/opt/zstd/lib -lzstd
$ otool -L ./a.out 
./a.out:
        /opt/homebrew/opt/zstd/lib/libzstd.1.dylib (compatibility version 1.0.0, current version 1.5.5)
        /usr/lib/libSystem.B.dylib (compatibility version 1.0.0, current version 1319.100.3)
$ cc -Wl,-dead_strip_dylibs hello.c -L/opt/homebrew/opt/zstd/lib -lzstd
$ otool -L ./a.out                                                     
./a.out:

Tested on:

$ uname -a 
Darwin MacBook-Air-de-Urmila.local 22.5.0 Darwin Kernel Version 22.5.0: Thu Jun  8 22:21:34 PDT 2023; root:xnu-8796.121.3~7/RELEASE_ARM64_T8112 arm64
$ sw_vers 
ProductName:  macOS
ProductVersion:  13.4.1
ProductVersionExtra: (c)
BuildVersion:  22F770820d
$ cc -v
Apple clang version 14.0.3 (clang-1403.0.22.14.1)
Target: arm64-apple-darwin22.5.0
Thread model: posix
InstalledDir: /Library/Developer/CommandLineTools/usr/bin

@xavierleroy
Contributor Author

My test was done with slightly more recent software:

~/tmp$ uname -a
Darwin ***** 23.0.0 Darwin Kernel Version 23.0.0: Fri Sep 15 14:42:42 PDT 2023; root:xnu-10002.1.13~1/RELEASE_X86_64 x86_64
~/tmp$ cc -v
Apple clang version 15.0.0 (clang-1500.0.40.1)
Target: x86_64-apple-darwin23.0.0
Thread model: posix
InstalledDir: /Library/Developer/CommandLineTools/usr/bin
~/tmp$ sw_vers
ProductName:		macOS
ProductVersion:		14.0
BuildVersion:		23A344

Could be a recent breakage in macOS ld. Someone who likes software archeology could look into the linker sources. At any rate, it's always delicate to rely on undocumented linker flags.

@nojb
Contributor

nojb commented Nov 6, 2023

Could be a recent breakage in macOS ld. Someone who likes software archeology could look into the linker sources. At any rate, it's always delicate to rely on undocumented linker flags.

Just for the record, which version of ld is your system using? The one I tested is:

$ ld -v
@(#)PROGRAM:ld  PROJECT:ld64-857.1
BUILD 23:13:29 May  7 2023
configured to support archs: armv6 armv7 armv7s arm64 arm64e arm64_32 i386 x86_64 x86_64h armv6m armv7k armv7m armv7em
LTO support using: LLVM version 14.0.3, (clang-1403.0.22.14.1) (static support for 29, runtime is 29)
TAPI support using: Apple TAPI version 14.0.3 (tapi-1403.0.5.1)

Unfortunately, the most recent version of ld64 (the Apple linker) that I could find in the open was 711: https://github.com/apple-oss-distributions/ld64/...

@xavierleroy
Contributor Author

On my Mac, I have

~/tmp$ ld -v
@(#)PROGRAM:ld  PROJECT:dyld-1015.7
BUILD 18:48:43 Aug 22 2023
configured to support archs: armv6 armv7 armv7s arm64 arm64e arm64_32 i386 x86_64 x86_64h armv6m armv7k armv7m armv7em
will use ld-classic for: armv6 armv7 armv7s arm64_32 i386 armv6m armv7k armv7m armv7em
LTO support using: LLVM version 15.0.0 (static support for 29, runtime is 29)
TAPI support using: Apple TAPI version 15.0.0 (tapi-1500.0.12.3)
Library search paths:
Framework search paths:

Maybe you need to look into the dyld project instead of the ld64 project? (Shooting in the dark here.)

@nojb
Contributor

nojb commented Nov 6, 2023

Maybe related:

https://developer.apple.com/documentation/xcode-release-notes/xcode-15-release-notes#Linking

A new linker has been written to significantly speed up static linking. It’s the default for all macOS, iOS, tvOS and visionOS binaries and anyone using the “Mergeable Libraries” feature. The classic linker can still be explicitly requested using -ld64, and will be removed in a future release.
 (108915312)

@xavierleroy: could you try with -ld64 to see if it makes a difference? (not sure if the flag needs to be passed to the compiler or the linker).

@nojb
Contributor

nojb commented Nov 6, 2023

Also spotted: the Go developers are passing -Wl,-ld_classic to revert to the "classic" linker for the time being:

golang/go@3ef4f93

@xavierleroy
Contributor Author

I confirm that adding -ld_classic or -Wl,-ld_classic makes -dead_strip_dylibs work for me:

~/tmp$ cc -ld_classic -Wl,-dead_strip_dylibs hello.c -L/usr/local/opt/zstd/lib -lzstd
~/tmp$ otool -L ./a.out
./a.out:
	/usr/lib/libSystem.B.dylib (compatibility version 1.0.0, current version 1336.0.0)

-ld64 works too but with a warning saying to use -ld_classic.

So, this could be a workaround, but only until "the classic linker [is] removed in a future release"...
