Skip to content

Move compressed marshaling away from the stdlib and into the compilers (5.1 version)#12734

Merged
xavierleroy merged 11 commits intoocaml:5.1from
xavierleroy:no-marshal-compression-flag-1
Nov 20, 2023
Merged

Move compressed marshaling away from the stdlib and into the compilers (5.1 version)#12734
xavierleroy merged 11 commits intoocaml:5.1from
xavierleroy:no-marshal-compression-flag-1

Conversation

@xavierleroy
Copy link
Copy Markdown
Contributor

This is the second attempt at fixing the problem reported in #12562 concerning the compressed marshaling facility introduced in OCaml 5.1 by #12006.

The first attempt #12705 kept the API introduced in #12006 (an extra flag Marshal.Compression in the Marshal standard library module) and tried to make sure that the ZSTD library is dynamically linked with ocamlopt-generated executables only if the Marshal module is actually used. This failed because of limitations of some system linkers.

The present attempt removes the Marshal.Compression API and makes compressed marshaling unavailable via the standard library and private to the OCaml compilers. A new module Compression is added to the compilers and to compiler-libs, and used to compress and uncompress the same compilation artefacts as in #12006. (Unless ZSTD is not available or configured out, in which case plain marshaling is used.)

The implementation reuses parts of the first attempt #12705:

  • The ZSTD-specific C code is moved to runtime/zstd.c and hooks into extern.c and intern.c when activated.
  • The bytecode interpreter ocamlrun and executables produced with ocamlc -custom always include zstd.c and always dynamically link with ZSTD. (Doing otherwise complicates coldstart and bootstrap too much.)

What's new in this second attempt:

  • Native-code executables produced by ocamlopt do NOT include zstd.c and do NOT link with ZSTD, unless they use compiler-libs (see next).
  • zstd.c and -lzstd are automatically included when linking with ocamlcommon.cmxa, the first part of native-code compiler-libs. This makes them usable from ocamlc.opt, ocamlopt.opt, and any native-code program that uses compiler-libs and therefore needs to work with compressed compilation artifacts.

This PR is against 5.1, so that it can be integrated in release 5.1.1. Two points of attention:

  • Accepting this PR means that we introduce an API incompatibility between 5.1.1 and 5.1 (the removal of the Marshal.Compression flag). IIRC, we never did this (a minor release with an incompatible API) before. The working group composed of @dra27, @Octachron, @nojb and I think this is acceptable given the circumstances. This will have to be confirmed at the next developers meeting.
  • The implementation of this PR was intentionally kept small and a bit hackish to minimize the changes in 5.1.1. If it is accepted, we (the working group) have ideas to clean up the implementation and perhaps make compressed marshaling accessible to users again (as a separate library in otherlibs/, explicitly linked, like the Unix library) in time for 5.2.

xavierleroy and others added 6 commits November 4, 2023 19:37
Currently, both the bytecode version and the native-code version of
Dynlink drag in many modules from the compiler, including some that
use compressed marshaling.  This causes users of dynlink.cmxa to
depend on -lzstd systematically.

Actually, the native-code version of Dynlink needs much fewer compiler
modules, none of which require compressed marshaling.

This commit just shrinks the list of compiler modules included in
dynlink.cmxa, thus cutting the dependency on -lzstd.
Manual cherry-pick from 7ed7268 .

Co-authored-by: David Allsopp <david.allsopp@metastack.com>
These functions used to be in extern.c and intern.c.
Now, they are in zstd.c and connect themselves to extern.c and intern.c
via hooks activated by a new primitive `caml_zstd_initialize`.
- Removes the Marshal.Compression constructor, in preparation for
  explicit API in 5.2
- Add just to_channel (for now) to ocamlcommon
- caml_zstd_initialize called from Compression for now.
…value`

The "flags" argument was always [], so remove it.
This avoids an Obj.magic in compression.ml.
@xavierleroy xavierleroy force-pushed the no-marshal-compression-flag-1 branch from d0c4a14 to 63f2cfa Compare November 12, 2023 17:30
Under the hood, it's just `Stdlib.input_value`.
However, it documents the places in the compiler where we expect to read
compressed marshaled data.
…erlibs

We attach -lzstd to ocamlcommon.cmxa and insert zstd.n.o in ocamlcommon.a.
(The latter is a hack, to be replaced by something better eventually.)
@xavierleroy xavierleroy force-pushed the no-marshal-compression-flag-1 branch from 63f2cfa to 768beb3 Compare November 12, 2023 17:37
compilerlibs/ocamlcommon.cmxa: $(COMMON_CMI) $(COMMON:.cmo=.cmx) $(ZSTD_OBJ)
$(V_LINKOPT)$(CAMLOPT) -a -linkall -o $@ \
$(ZSTD_FLAGS) $(COMMON:.cmo=.cmx)
$(V_MKLIB)$(ARCMD) r compilerlibs/ocamlcommon.$(A) $(ZSTD_OBJ)
Copy link
Copy Markdown
Contributor Author

@xavierleroy xavierleroy Nov 12, 2023

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

This creative use of ar is due to @dra27. If we just add zstd.n.o to the ocamlopt -a command line, this file name is recorded in ocamlcommon.cmxa and passed to the system linker when the latter file is used, but the system linker will not find it and will not search for it in the -L search path. Instead, we include zstd.n.o and its contents in the ocamlcommon.a archive, where the system linker will find it reliably.

It's unclear whether this hack can work with the MSVC librarian lib. Good thing we're not supporting MSVC again yet.

To make things cleaner in the future, we can either

  • Change ocamlopt -a so that all .o files given on the command line are included in the generated .a file. The current behavior (of recording the file names in the .cmxa file) is useless anyway.
  • Put zstd.n.o in a proper .a archive, and install this archive. This would be the case if we make an otherlibs that exposes compressed marshaling to users.

Copy link
Copy Markdown
Member

@kit-ty-kate kit-ty-kate left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Successfully tested on macOS, FreeBSD and Alpine Linux. Thank you so much!

@xavierleroy
Copy link
Copy Markdown
Contributor Author

The incompatible API change (removal of Marshal.Compression flag and Marshal.compression_available function) was approved at this week's developer meeting. @damiendoligez promised a review of this PR.

@xavierleroy xavierleroy added this to the 5.1.1 milestone Nov 18, 2023
Copy link
Copy Markdown
Member

@damiendoligez damiendoligez left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Looks good modulo two nitpicks.

@damiendoligez
Copy link
Copy Markdown
Member

Because #12006 was released, you'll also need a Changes entry for this PR.

xavierleroy added a commit to xavierleroy/ocaml that referenced this pull request Nov 20, 2023
- Add type constraint in dynlink/byte/dynlink.ml
- Reword configure message about ZSTD support

Co-authored-by: Damien Doligez <damien.doligez@gmail.com>
@xavierleroy xavierleroy force-pushed the no-marshal-compression-flag-1 branch from 3814387 to d3f94e9 Compare November 20, 2023 15:04
- Add type constraint in dynlink/byte/dynlink.ml
- Reword configure message about ZSTD support

Co-authored-by: Damien Doligez <damien.doligez@gmail.com>
@xavierleroy xavierleroy force-pushed the no-marshal-compression-flag-1 branch from d3f94e9 to 0713d6a Compare November 20, 2023 15:12
@xavierleroy xavierleroy force-pushed the no-marshal-compression-flag-1 branch from 0713d6a to 49b6988 Compare November 20, 2023 15:23
@xavierleroy
Copy link
Copy Markdown
Contributor Author

CI is green (here and on Jenkins). Merging in 5.1 branch. Will port to trunk and improve soon.

@xavierleroy xavierleroy merged commit 9f9c1f7 into ocaml:5.1 Nov 20, 2023
@xavierleroy xavierleroy changed the title Move compressed marshaling away from the stdlib and into the compilers Move compressed marshaling away from the stdlib and into the compilers (5.1 version) Nov 21, 2023
const unsigned char *,
uintnat) = NULL;

CAMLprim value caml_compression_available(value vunit)
Copy link
Copy Markdown
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

This is not used, is it intended ?

Copy link
Copy Markdown
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Yes, indeed. To avoid an unnecessary bootstrap, the primitive will be removed before at release.


CAMLprim value caml_zstd_initialize(value vunit)
{
return Val_unit;
Copy link
Copy Markdown
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

return Val_false;

{
caml_extern_compress_output = caml_zstd_compress;
caml_intern_decompress_input = caml_zstd_decompress;
return Val_unit;
Copy link
Copy Markdown
Contributor

@hhugo hhugo Nov 27, 2023

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

return Val_true;

Copy link
Copy Markdown
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Fixed in 8abed5e .

Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment

Labels

None yet

Projects

None yet

Development

Successfully merging this pull request may close these issues.

6 participants