Skip to content

Recursive yet unboxed#2188

Merged
gasche merged 15 commits intoocaml:trunkfrom
rlepigre:recursive-yet-unboxed
Jan 29, 2020
Merged

Recursive yet unboxed#2188
gasche merged 15 commits intoocaml:trunkfrom
rlepigre:recursive-yet-unboxed

Conversation

@rlepigre
Copy link
Copy Markdown

@rlepigre rlepigre commented Dec 8, 2018

We (@gasche and I) give a new implementation of the unboxability check for single-constructor types, following theoretical investigations presented in our JFLA 2019 paper (which refines the work of @SimonColin as part of an internship with @gasche). Mutually-recursive type definitions can now contain unboxed declarations, which was not the case before.

@rlepigre rlepigre force-pushed the recursive-yet-unboxed branch from e32351f to 524a1aa Compare December 8, 2018 09:02
@gasche
Copy link
Copy Markdown
Member

gasche commented Dec 8, 2018

The history is not as clean as I would like, but we believe that the code is ready for review. I hope to rebase the PR later and clean the history in passing.

Below are two remarks that I wrote on the implementation.

.

We distinguish the notion of "unboxability" (of type definitions) and
"separability", with respect to the Config.flat_float_array setting:
separability is a semantic notion that is always well-defined
(and does not depend on Config.flat_float_array), but "unboxability"
only depends on separability when Config.flat_float_array is set.

Ideally, the typedecl_separability module would never depend on
Config.flat_float_array, and this flag would be tested outside, by the
caller. In practice, it is more regular (with respect to other
typedecl_* modules) to have the check in the module, but it is only at
the edge, in the compute_decl function that is exposed outside
(and also in the Types.Separability.default_signature function).

.

There is a strange duplication between Types.Separability.default_signature
and Types_separability.msig_of_external_type. They seem to do the same thing,
but they don't have access to (or use) the same information:

  • default_signature is used to construct type declarations from
    nothing; at this point, there is no immediacy information. On the
    other hand, it checks Config.flat_float_array and uses Ind instead
    of Deepsep when it is not set.

  • msig_of_external_type takes the declaration of the type as
    argument. In particular, it checks immediacy (immediate types
    are always separable).

I thought about this, and I don't see any way to do better than having
these two functions. Note that msig_of_external_type is not exposed
directly by the Typedecl_separability signature, but it is accessible
as the default field of Typedecl_separability.property.

@gasche
Copy link
Copy Markdown
Member

gasche commented Dec 8, 2018

(cc @damiendoligez, @yallop and @Armael; you might be interested in this PR)

@Armael
Copy link
Copy Markdown
Member

Armael commented Dec 8, 2018

Don't count on me for this one, I won't have much time until a couple months.

separability cannot be established. *)

type mode = Types.Separability.t = Ind | Sep | Deepsep
(** The mode [Sep] ("separable") is used on type expressions that must be
Copy link
Copy Markdown
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

The comment phrases the mode as a requirement on arguments ("type expressions that must be separable"), and that's generally how it's used. But I'd prefer to see it defined more neutrally somewhere as a description of type expressions. One option is to document Types.Separability.t as a (representation of a) property of type expressions, and separately document Typedecl_separability.mode as a requirement on arguments, as it is currently.

Copy link
Copy Markdown
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

I rewrote this comment to include for each mode, then the type-expression view, and then the type-parameter view. I think the latter is necessary to give intuition (type expressions themselves, at the toplevel, must all be separable).

@gasche
Copy link
Copy Markdown
Member

gasche commented Dec 11, 2018

@yallop: thanks! I fixed the easy issues, changes for the other are yet to come.

@gasche gasche force-pushed the recursive-yet-unboxed branch 2 times, most recently from 0211e96 to f9c6b26 Compare December 14, 2018 05:30
@gasche
Copy link
Copy Markdown
Member

gasche commented Dec 14, 2018

I rebased the PR history, it reads like a novel now. (cc @rlepigre)

@rlepigre
Copy link
Copy Markdown
Author

@gasche: A bestseller for sure!

@gasche gasche force-pushed the recursive-yet-unboxed branch 2 times, most recently from 15c348c to a018864 Compare December 20, 2018 14:15
@gasche
Copy link
Copy Markdown
Member

gasche commented Dec 20, 2018

I rebased against the current trunk and finished taking into accounts the comments of @yallop's first review pass. Thanks!

@gasche gasche force-pushed the recursive-yet-unboxed branch from a018864 to 5e74e19 Compare December 30, 2018 10:54
@gasche
Copy link
Copy Markdown
Member

gasche commented Feb 1, 2019

(cc @let-def, who seemed interested in our presentation of this work at JFLA 2019.)

(Jeremy, it's not clear to me whether you are satisfied with the PR as it is proposed.)

@gasche
Copy link
Copy Markdown
Member

gasche commented Feb 24, 2019

There is nothing very urgent on this PR (... well it does fix a limitation with an existing feature), but it would still be nice to find someone interested in making a full review. Maybe someone among @yallop, @Armael, @Octachron, @Drup, @let-def, @trefis?

Copy link
Copy Markdown
Member

@damiendoligez damiendoligez left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Looks good except for a few nitpicks.

Copy link
Copy Markdown
Contributor

@let-def let-def left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

I am not done reviewing this PR, but I already have a few minor remarks.

@gasche
Copy link
Copy Markdown
Member

gasche commented Oct 18, 2019

I am in the progress of rebasing this PR on trunk (it requires a bit of work). Please keep sending the good review comments in the meantime :-)

@rlepigre
Copy link
Copy Markdown
Author

I am in the progress of rebasing this PR on trunk (it requires a bit of work). Please keep sending the good review comments in the meantime :-)

Thanks for taking care of that @gasche. I don't have much time currently, but is there anything I can help with?

@gasche gasche force-pushed the recursive-yet-unboxed branch from 0987527 to bbf373f Compare October 18, 2019 13:36
@gasche
Copy link
Copy Markdown
Member

gasche commented Oct 18, 2019

I just finished rebasing on trunk and pushed to this branch.

@rlepigre: Thanks! I'm happy with dealing with the mostly local/form comments so far, but we can definitely discuss it if more ambitious changes are requested by reviewers.

@smuenzel-js
Copy link
Copy Markdown
Contributor

how's this PR going?

@gasche
Copy link
Copy Markdown
Member

gasche commented Jan 9, 2020

After @smuenzel-js's ping, @damiendoligez started to review again (I think he hasn't validated his last review so his comments do not show up). We discussed the non-principality issue together (to me this is the last standing issue that required a discussion).

Non-principality shows up for types such as

type ('a, 'b) almost_eq =
| Almost_refl : 'c -> ('c, 'c) almost_eq

with (at least) two parameters ('a, 'b) which are in fact always equal, and must be separable. We can give the separability modes (Sep, Ind) or also (Ind, Sep), which allow more programs thant (Sep, Sep), but (Ind, Ind) would be unsound.

We decided that there was, in fact, a more natural choice among the two options: it is more natural to constrain the first/leftmost type parameter more, and to let the second be Ind because it is equal to the first.

I just pushed a last commit to the PR that enforces this ordering, by processing GADT equations from left to right.

Copy link
Copy Markdown
Member

@damiendoligez damiendoligez left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Looks good to me, modulo a few nitpicks.

In general, the analysis performs a fixpoint computation. It is somewhat
similar to what is done for inferring the variance of type parameters.

Our analysis is defined using inference rules "Def; Gamma |- t : m", in
Copy link
Copy Markdown
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Strictly speaking, Def; Gamma |- t : m is a statement, not an inference rule.

@gasche gasche force-pushed the recursive-yet-unboxed branch from 65d31e3 to 47590a0 Compare January 27, 2020 16:11
@gasche
Copy link
Copy Markdown
Member

gasche commented Jan 27, 2020

@damiendoligez thanks, I took your suggestions into account and updated the branch.

@gasche gasche force-pushed the recursive-yet-unboxed branch from 47590a0 to 22493b0 Compare January 28, 2020 10:36
@gasche
Copy link
Copy Markdown
Member

gasche commented Jan 28, 2020

I just rebased on trunk again, this should be good to merge once the CI is happy.

Rodolphe Lepigre and others added 9 commits January 28, 2020 11:39
… information

In the future, we could move the arity-related checks
(one constructor, one parameter) and error logic from typedecl.ml to
typedecl_separability.ml.
(this changes the .cmi format and thus requires a bootstrap,
to follow as a separate commit)

(includes bug fixes by Rodolphe Lepigre)
@gasche gasche force-pushed the recursive-yet-unboxed branch from 22493b0 to e0850a8 Compare January 28, 2020 11:05
Rodolphe Lepigre and others added 4 commits January 28, 2020 12:09
Because this changes the separability of standard library types, and
those separabilities are stored in the .cmi files, this commit changes
the .cmi files in the standard libraries in way that appear to require
a bootstrap (it looks like some part of the stdlib is built with
boot/ocamlc and others with ocamlc, and the two should produce/expect
the same .cmi exactly). The bootstrap will come as a separate commit.
@gasche gasche force-pushed the recursive-yet-unboxed branch from e0850a8 to e940f92 Compare January 28, 2020 12:52
The separability signature of a type declaration is not inferred in
a principal way, it depends on the order in which GADT equations are
processed. In non-principal cases, we may have two parameters that are
related by an equality, with one of them being given mode Ind and the
other Sep. Either choice of which to make Sep is sound, but (Ind, Ind)
would be unsound.

We change the implementation to ensure that equations are processed in
an order such that the lefmost parameters are the most constained: if
equations imply that ('a = 'b), with the parameter 'a coming before 'b
in the type declaration, and they must be separable, then 'a gets the
mode Sep and 'b gets the mode Ind. This corresponds intuitively to
remembering that 'b is equal to a previous parameter, instead of
remembering than 'a is equal to a not-seen-yet parameter.
@gasche gasche merged commit 13ab6b4 into ocaml:trunk Jan 29, 2020
@rlepigre rlepigre deleted the recursive-yet-unboxed branch January 29, 2020 09:17
@shindere
Copy link
Copy Markdown
Contributor

shindere commented Feb 3, 2020

It seems that commit 358c7ce breaks trunk
when configured with the
--disable-flat-float-array option.

The problem is very easy to reproduce. Checkkout the commit mentionned
above, git clean -xffd, configure with the --disable-flat-float-array
option and run make. It should lead to inconsistent assumptions over an
interface.

@gasche do you think you'd be able to fix this?

FWIW a job on Inria's CI had detected the problem but it seems it didn't
receive much attention.

@gasche
Copy link
Copy Markdown
Member

gasche commented Feb 3, 2020

On my machine, the current HEAD of trunk builds correctly with --disable-flat-float-array, and the testsuite passes. The commit you point to gives a disfunctional state because it requires a bootstrap and we have a not-so-good policy to put bootstraps in separate commit.

@shindere
Copy link
Copy Markdown
Contributor

shindere commented Feb 4, 2020 via email

@gasche
Copy link
Copy Markdown
Member

gasche commented Feb 4, 2020

So there is indeed a CI bug when --disable-flat-float-array and --enable-flambda are passed at the same time, which I could reproduce on my machine. It looks like a Makefile dependency ordering issue, because I could get the build to work by cleaning up some bits manually and redoing the rest of the build. I'm not sure why only this configuration is affected, but note that this PR changes .cmi format and their digests in ways that required bootstraps, so that may explain why certain cmi-digest issues are now showing up.

@gasche
Copy link
Copy Markdown
Member

gasche commented Feb 4, 2020

I'm starting to understand what is going on, writing it down here.

The first thing to understand is that different cmi files get produced by compilers depending on whether --disable-flat-float-array is set: a compiled interface exports the "separability" of each datatype definition, and when flat-float-array is disabled all datatypes are separable, so the information is different (more permissive) from when it is not set.

In the case where --disable-flat-float-array is set, the bootstrap compiler comes from a bootstrap without this setting, so boot/ocamlc and ocamlc generate different cmi. (They have the same format and, in this setting, compiling with either cmis is sound/correct; the bootstrap-compiler-produced cmis are more restrictive, but we know the compiler codebase passes them anyway as they are the standard mode.)

Now here is a partial analysis of the (failing) build logs:

$ grep -E "( map.ml|/misc.ml)" log
../boot/ocamlrun ../boot/ocamlc -use-prims ../runtime/primitives -strict-sequence -absname -w +a-4-9-41-42-44-45-48 -g -warn-error A -bin-annot -nostdlib -safe-string -strict-formats   -o stdlib__map.cmi -c map.mli
../boot/ocamlrun ../boot/ocamlc -use-prims ../runtime/primitives -strict-sequence -absname -w +a-4-9-41-42-44-45-48 -g -warn-error A -bin-annot -nostdlib -safe-string -strict-formats   -o stdlib__map.cmo -c map.ml
./boot/ocamlrun ./boot/ocamlc -g -nostdlib -I boot -use-prims runtime/primitives -strict-sequence -principal -absname -w +a-4-9-40-41-42-44-45-48-66 -warn-error A -bin-annot -safe-string -strict-formats -I utils -I parsing -I typing -I bytecomp -I file_formats -I lambda -I middle_end -I middle_end/closure -I middle_end/flambda -I middle_end/flambda/base_types -I asmcomp -I asmcomp/debug -I driver -I toplevel -c utils/misc.mli
./boot/ocamlrun ./boot/ocamlc -g -nostdlib -I boot -use-prims runtime/primitives -strict-sequence -principal -absname -w +a-4-9-40-41-42-44-45-48-66 -warn-error A -bin-annot -safe-string -strict-formats -I utils -I parsing -I typing -I bytecomp -I file_formats -I lambda -I middle_end -I middle_end/closure -I middle_end/flambda -I middle_end/flambda/base_types -I asmcomp -I asmcomp/debug -I driver -I toplevel -c utils/misc.ml
../boot/ocamlrun ../ocamlc -strict-sequence -absname -w +a-4-9-41-42-44-45-48 -g -warn-error A -bin-annot -nostdlib -safe-string -strict-formats   -o stdlib__map.cmi -c map.mli
../boot/ocamlrun ../ocamlc -strict-sequence -absname -w +a-4-9-41-42-44-45-48 -g -warn-error A -bin-annot -nostdlib -safe-string -strict-formats   -o stdlib__map.cmo -c map.ml
           -o stdlib__map.cmx -c map.ml
./boot/ocamlrun ./ocamlopt -g -nostdlib -I stdlib -I otherlibs/dynlink -strict-sequence -principal -absname -w +a-4-9-40-41-42-44-45-48-66 -warn-error A -bin-annot -safe-string -strict-formats -I utils -I parsing -I typing -I bytecomp -I file_formats -I lambda -I middle_end -I middle_end/closure -I middle_end/flambda -I middle_end/flambda/base_types -I asmcomp -I asmcomp/debug -I driver -I toplevel -function-sections -c utils/misc.ml

What I see is that:

  1. stdlib/map.{mli,ml} is built with boot/ocamlc
  2. utils/misc.{mli.ml} is built with boot/ocamlc
  3. stdlib/map.{mli,ml} is rebuilt with ocamlc, generating different cmi files
  4. utils/misc.ml is rebuilt with ocamlopt to get a .cmx, and there is a cmi mismatch

At step (4), utils/misc.ml is rebuilt (with ocamlopt, but ocamlc would mismatch as well), but misc.mli is not rebuilt. This may sound reasonable as the textual dependencies of misc.{mli,ml} have not changed, but it is in fact incorrect as the previous build happened with the bootstrap-produced stdlib cmis, and we are now building against the normal-compiler-produced stdlib.

I don't think that this PR is to blame for this problem, it's rather one of the rare few PRs doing this kind of things (changing cmis), in a funny enough way (depending on configure flags) that it caught the issue. There is missing dependency information in the Makefile, which should notice that utils/misc.mli needs rebuilding as stdlib/map.cmi changed.

@gasche
Copy link
Copy Markdown
Member

gasche commented Feb 8, 2020

I created a new issue, #9291, to discuss the build failure and Makefile dependency issues. I will not discuss it more on the present thread.

Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment

Labels

None yet

Projects

None yet

Development

Successfully merging this pull request may close these issues.

8 participants