
Relax the handling of explicit polymorphic types #1132

Merged
lpw25 merged 3 commits into ocaml:trunk from lpw25:relax-poly-types
Jan 20, 2020

Conversation

@lpw25
Contributor

@lpw25 lpw25 commented Apr 3, 2017

This is an up-to-date version of my patch from PR#6673.

It changes the approach for checking polymorphic types. Instead of generalising the unification variables and then mutating them to be universal variables, it generalises the whole type and then instantiates it with universal variables for the generalised unification variables. This instance is then unified with the intended polymorphic type -- which allows the usual unification checks to ensure that universal variables do not escape.

@mshinwell
Contributor

@garrigue Would you be able to look at this?

@mshinwell
Contributor

@garrigue Ping

@garrigue
Contributor

I'm not yet convinced that this makes the language more usable.
There do not seem to be specific tests (which is strange for a new feature),
but here is the behavior I see:
Before

# let f : 'a. 'a -> 'a = fun (x : 'a) -> x;;
Error: This definition has type 'a -> 'a which is less general than
         'a0. 'a0 -> 'a0

After

# let f : 'a. 'a -> 'a = fun (x : 'a) -> x;;
val f : 'a -> 'a = <fun>
# let f : type a. a -> a = fun (x : 'a) -> x;;
Error: This pattern matches values of type 'a
       but a pattern was expected which matches values of type a
       The type constructor a would escape its scope

I.e., the situation is improved for explicit universal types at toplevel, but not their combination with local abstract types.

@lpw25
Contributor Author

lpw25 commented May 15, 2017

This patch does indeed only fix the situation for universal types and not for local abstract types. It is not immediately clear to me if the same thing can be done for local abstract types as it is a slightly different sort of problem. Possibly it would be safe to just introduce the type constructor at the outer level.

However, I think this patch still makes the language noticeably more usable. The error you get for local abstract types is clearer than the one for universal variables because it is a simpler kind of error: there is an obvious type binding at a particular level and that binding is escaping its scope. These scope escape errors are familiar to anyone working with local abstract types. Whereas the universal variable error is more subtle: there is no obvious type binding happening -- the 'a is clearly not bound in the expression -- instead it is about the level at which the generalisation is being checked, an inherently more subtle concept.

@garrigue
Contributor

I'm wondering whether it wouldn't be more natural to change the semantics of

let f : 'a. 'a -> 'a = fun (x : 'a) -> x

to have the 'a on x bound by the polymorphic annotation.
This is a change in semantics, but I would expect most people to find it intuitive, and I would be very surprised if it would break any program.

@lpw25
Contributor Author

lpw25 commented May 18, 2017

but I would expect most people to find it intuitive

I'm unsure about this one. I find that beginners are often confused about the difference between universal variables and unification variables. I worry this change would make that confusion worse.

Either way it doesn't improve the situation for:

let f : 'a . 'a -> 'a = fun (x : 'b) -> x

if anything it makes it even more confusing.

@garrigue
Contributor

It looks like my view is diametrically opposed.
Namely, for me the problem is not so much that named unification variables clash with explicitly polymorphic annotations (this is not a new problem; there have long been reports of losing polymorphism due to type annotations), but the fact that there is no way to scope these variables.

I would be quite happy with doing nothing, saying that one can still use locally abstract types, but if we are to do something it should try to solve the real problem. What I suggested is to let type annotations scope type variables, the way it is done in Haskell for instance. For people who don't like the behavior of unification variables, this allows them to completely ignore them. This may look like a big change, but it is essentially compatible, and easy to understand in terms of scope. Of course this would require checking that nothing breaks.

On the other hand, examples such as

let f : 'a . 'a -> 'a = fun (x : 'b) -> x

look just strange to me. If you use a different name, why expect it to be the same thing? Because you know the behavior of unification variables: but then you should be aware of the scoping problem too.

This said, there is also the solution of going the Standard ML way: have unification variables use the highest level possible. This is tricky to implement (it requires a first pass to insert scoping information in the syntax tree), but it would solve all the problems we have with them.

@lpw25
Contributor Author

lpw25 commented May 19, 2017

It looks like my view is diametrically opposed.

I wouldn't say diametrically opposed. I think that the scoping of unification variables and the level at which polymorphic annotations are checked are both problems that should be solved. For instance, here is an example that does not use unification variables:

let rec f : 'a. 'a -> 'a = fun x -> g x
and g x = x

which gives an error with the current compiler:

Error: This definition has type 'a -> 'a which is less general than
         'a0. 'a0 -> 'a0

but is allowed with this patch. This seems like an obvious improvement to me, especially since the original error is a quite confusing one.
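For concreteness, here is that example as a self-contained program (the assertions are mine, not from the PR); on a compiler with this patch (OCaml >= 4.11) it compiles and runs:

```ocaml
(* f is annotated as polymorphic; g is not, and is used inside f.
   The relaxed check generalizes the whole recursive group before
   comparing against the annotation, so this is accepted. *)
let rec f : 'a. 'a -> 'a = fun x -> g x
and g x = x

let () =
  assert (f 1 = 1);
  assert (g "ok" = "ok")
```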

What I suggested is to let type annotations scope type variables, the way it is done in Haskell for instance.

I still find the idea of letting the 'a in let x : 'a. ... = ... scope over the expression potentially confusing. For example, if/when we get polymorphic arguments it seems weird to me that the scoping of 'a in ('a. 'a -> 'a) -> int will be so different from the scoping of 'a in 'a. 'a -> 'a.

This said, there is also the solution of going the Standard ML way: have unification variables use the highest level possible. This is tricky to implement (it requires a first pass to insert scoping information in the syntax tree), but it would solve all the problems we have with them.

This would be my preferred solution to the unification scoping issue as I think that it matches people's expectations quite well without changing the binding rules for 'a . ....

@lpw25
Contributor Author

lpw25 commented May 19, 2017

As another point I also think that the approach in this patch of using a form of instantiation to create the polymorphic type and using the occurs check to catch escaping universal variables is cleaner than the old approach of mutating the original type and using levels.

In particular, I think it allows better error messages. For example, this code:

let rec depth : 'a. 'a option -> _ =
  function Some x -> x | None -> assert false

currently gives the error:

Error: This definition has type 'a option -> 'a which is less general than
         'a0. 'a0 option -> 'a

but with this patch you get:

Error: This expression has type 'a. 'a option -> 'a
       but an expression was expected of type 'a. 'a option -> 'b
       The universal variable 'a would escape its scope

@lpw25
Contributor Author

lpw25 commented Sep 11, 2017

@garrigue Any chance we can agree on this before the 4.06 freeze?

@garrigue
Contributor

We should have discussed this at ICFP. As things stand now, I remain unconvinced.

@xavierleroy
Contributor

Two years later, did @lpw25 and @garrigue reach an agreement about this PR?

@lpw25
Contributor Author

lpw25 commented Oct 15, 2019

Persuading @garrigue is still on my todo list. (I aim to do it at every developer's meeting but then never find the time/remember).

@lpw25
Contributor Author

lpw25 commented Nov 15, 2019

@garrigue and I chatted about this briefly at the recent developer meeting.

We agreed that switching from checking polymorphic annotations with only the current recursive binding generalized to checking polymorphic annotations when all the recursive bindings have been generalized is fine.

In other words, supporting this:

let rec foo : 'a. 'a -> 'a = fun x -> bar x
and bar y = y;;

is fine.

The details of the actual choice of implementation in this PR still need to be reviewed.
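For readers unfamiliar with why explicit polymorphic annotations matter at all, here is a standard illustration (my own sketch, not from this PR): polymorphic recursion over a nested datatype, which cannot be typed without the 'a. annotation because each recursive call is at a different type instance:

```ocaml
(* Each Nest doubles the element type, so the recursive call to depth
   is at type ('a * 'a) nested -- this needs the explicit 'a. annotation. *)
type 'a nested = Leaf of 'a | Nest of ('a * 'a) nested

let rec depth : 'a. 'a nested -> int = function
  | Leaf _ -> 0
  | Nest n -> 1 + depth n

let () = assert (depth (Nest (Leaf (1, 2))) = 1)
```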

@garrigue
Contributor

I have looked at the code again.
I have no problem with what is done in Typecore.type_let, which amounts to delaying the check until after all definitions are generalized.
I'm still not entirely convinced of the need for the other changes. On the one hand, I understand that if the check is done on the type in the final environment, one needs to take a copy. On the other hand, wouldn't it be sufficient to take this copy in check_univars before my so-called "dirty" code (which is actually very similar to your subst_univars)?
I.e.

let check_univars env kind exp ty_expected vars =
  begin_def ();
  let vars, ty = instance_parameterized_type vars exp.exp_type in
  end_def ();
  (* need to expand twice? cf. Ctype.unify2 *)
  (* current code *)

Also, where does the need for the extra unification arise from? Still trying to understand what happens there.

@lpw25
Contributor Author

lpw25 commented Nov 19, 2019

I've rebased the code.

To aid review, I'll describe the old approach and the new approach in detail. I'll try to include enough detail so that more than just Jacques and I can follow along.

Old approach

Given code like:

let rec foo : 'a. 'a -> 'a = e
and ...

The old approach would first enter the scope of the definition of foo with begin_def () -- adding 1 to the current level. Then it would create an instance of the explicit polymorphic type:

'b -> 'b

Note that these variables are within the scope of foo -- they cannot appear in the final type of foo without first having their level decreased.

The old approach would then type-check e with an expected type of 'b -> 'b, potentially mutating this expected type to something like t1 -> t1.

Finally it would check that in the result type t1 was still a type variable and that it was still "generalizable" -- i.e. that its level had not been decreased so it was still not visible outside the scope of foo.

(It would also mutate t1 to be 'a, which leaves the original type in an ill-formed state but this can only affect tools reading .cmt files since otherwise the original type is unused).

After that it continues on to check the remaining mutually recursive bindings.

New approach

Given code like:

let rec foo : 'a. 'a -> 'a = e
and ...

The new approach takes an instance of the explicit polymorphic type -- without entering the scope of foo:

'b -> 'b

It then type-checks e with an expected type of 'b -> 'b, potentially mutating this expected type to something like t1 -> t1. It also type-checks all the remaining mutually recursive bindings, and generalizes the expected types of all of these including t1 -> t1.

Finally it makes an explicitly polymorphic copy of t1 -> t1. In this copy, if 'b is still a generalized type variable then it will be replaced with an explicit polymorphic variable 'c -- giving us 'c. 'c -> 'c. This type is then unified with the original explicitly polymorphic type 'a. 'a -> 'a.

How is the new approach more relaxed?

The new approach is more relaxed than the old approach because it allows code like:

let rec foo : 'a . 'a -> 'a = fun x -> bar x
and bar = fun y -> y

In the old approach, fun x -> bar x gets type-checked with an expected type of 'b -> 'b. Since the parameter is passed to bar, 'b gets lifted to a lower level -- it is visible outside of the scope of foo -- and this causes it to fail the "generalizable" check, giving an error.

In the new approach we instead generalize 'b outside the scope of foo, which succeeds, allowing us to produce the explicitly polymorphic type 'c. 'c -> 'c that can safely unify with 'a. 'a -> 'a.
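As a concrete check (my own snippet, not part of the PR), this program can be compiled and exercised on a patched compiler (OCaml >= 4.11), and bar indeed comes out polymorphic:

```ocaml
(* Accepted with the relaxed check: 'b is generalized outside the scope
   of foo, so the whole group type-checks and bar also gets the
   polymorphic type 'a -> 'a. *)
let rec foo : 'a. 'a -> 'a = fun x -> bar x
and bar = fun y -> y

let () =
  assert (foo 0 = 0);
  (* using bar at two different types shows it was generalized *)
  assert (bar "x" = "x" && bar 1 = 1)
```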

Why do we need to unify with the original polymorphic type?

Consider the function:

let rec foo : 'a . 'a -> 'd = fun x -> x

This code should be prevented because the expression actually has type 'a. 'a -> 'a, which is less general than 'a. 'a -> 'd.

In the old approach this code is disallowed because 'd is in the result type of foo. So type-checking fun x -> x with expected type 'b -> 'd will cause 'b to be lifted outside the scope of foo and it will not pass the generalizability check.

However, the new approach checks generalizability outside the scope of foo -- so this example passes that check and allows the explicitly polymorphic type 'c. 'c -> 'c to be produced.

This is why we also unify this polymorphic type with the original explicitly polymorphic type -- 'a. 'a -> 'd -- which fails thanks to the occurs_univars check.

This unification ensures that -- even though we allow the polymorphic variables to escape into the types of the other mutually recursive functions in the group -- we only allow expressions whose type is as general as that specified in the annotation.

@gasche
Member

gasche commented Nov 20, 2019

As a complete newcomer to this problem, here is what I would expect
from a high-level perspective when type-checking let rec:

  1. We build an environment where each polymorphic-annotated binding is
    assigned a generalized/polymorphic type, and each non-annotated
    binding is assigned an inference variable (in the current scope).

  2. We type check each right-hand-side of each binding in this
    environment, using the assigned type as expected type.

I'm not sure what it means to type-check an expression with
a polymorphic expected type, but that sub-question is independent from
the let rec construct. (I would expect: increase the generalization
level, create an instance with a locally-abstract type at this fresh
level, do the type-checking with this instance as expected type, and
then pop the level again.)

If I understand correctly, this corresponds rather precisely to the
"old approach", except that it uses a different approach to prevent
non-parametric behavior (instead of using an abstract constructor);
and a mutation approach that indeed looks a bit dubious.

The "new approach" mixes several different concepts together
(recursive bindings and the checking of polymorphic expected types), and
it looks a bit strange to me. The explanation in the PR convinces the
reader that removing various checks of the new implementation would be
unsound, but it did not convince me that the result is indeed sound --
could one of the universally-quantified variables leak into another
definition in a way that preserves generalizability, but is unsound?
(let rec x : 'a . 'a list ref = y and y = ref []?)

In the end, polymorphic recursion has a fairly open design space, and
the solution we use is fairly clear and predictable. You propose to do
something more elaborate in that space. I think that being more
elaborate is only worth it if:

  1. risk is low: the extra sophistication does not endanger soundness.

  2. gain is high: it allows accepting important new examples.

  3. predictability/simplicity of the overall system is preserved

From your description (which I managed to follow; thanks for being so
didactic!), none of (1), (2) or (3) are very clear to me -- but maybe
they are to people more familiar with the problem?


Here would be an idle idea for a potential approach that would work on
your examples -- but I can't tell if your examples are representative
of your real-world needs. Your "problematic" examples fail when we
type-check a polymorphic binding that depends on non-polymorphic
bindings. We could instead start by checking bindings (polymorphic or
not) that only depend on polymorphic bindings; generalize their type,
and insert the generalized/polymorphic type in the typing environment
for the other bindings. Once a binding is generalized in this way, the
set of other bindings that only depend on polymorphic bindings gets
larger. This approach is obviously sound (we only reorder checking,
generalizing after each check).

This proposal has two properties that I find nice:

  • Bindings that only depend on polymorphic bindings have the property
    that they never fail due only to a lack of annotation: if they fail
    to type-check, any clever strategy would have failed them as
    well. So my proposal (unlike the current state) is "optimal" for
    those bindings -- and therefore it is sensible to check them first.

  • I believe that this strategy is optimal/complete in all cases where
    the sub-dependency-graph induced by non-polymorphic bindings only is
    acyclic.

@lpw25
Contributor Author

lpw25 commented Nov 21, 2019

I think the most damning indictment of the existing mechanism is that:

module M : sig
  val x : 'a -> 'a
end = struct
  let x = e
end

and

let x : 'a. 'a -> 'a = e

do not accept the same expressions e. If e has type 'a. 'a -> 'a then I expect it to be allowed on the right-hand side of let x : 'a. 'a -> 'a = e. This PR makes that the case.
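To make the comparison concrete (names here are illustrative, not from the PR), with e = fun x -> x both forms accept the same definition:

```ocaml
(* The signature ascription checks polymorphism after generalization... *)
module M : sig
  val id : 'a -> 'a
end = struct
  let id = fun x -> x
end

(* ...and with this PR, the explicit polymorphic annotation on a
   let-binding behaves the same way. *)
let id' : 'a. 'a -> 'a = fun x -> x

let () = assert (M.id 1 = 1 && id' "s" = "s")
```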

I don't think it is fair to say that the new approach mixes different concepts. It is simply a new way to check ( ... : 'a. 'a -> 'a). It is accurate enough to allow for recursive bindings to be handled properly, but the concepts do not overlap. The more surprising thing -- see my previous attempts to fix recursive bindings in #6673 -- is that the existing way to check ( ... : 'a. 'a -> 'a) cannot handle the natural way to deal with recursive bindings.

The explanation in the PR convinces the
reader that removing various checks of the new implementation would be
unsound, but it did not convince me that the result is indeed sound
[...]
You propose to do something more elaborate in that space.

Personally, I think the new check is less elaborate and more obviously correct. You want to check a type like: 'a -> 'a against a polymorphic type 'b. 'b -> 'b. You do this in two steps:

  1. Unify 'a -> 'a with an instantiation of 'b. 'b -> 'b (i.e. 'c -> 'c).
  2. Unify 'b. 'b -> 'b with the poly type created by replacing 'c with a univar (i.e. 'd. 'd -> 'd)

This way you basically replace a normal polymorphic type with the corresponding explicit polymorphic type (i.e. you create 'd. 'd -> 'd from 'a -> 'a) and then you unify that with the expected type. By doing things this way you are basically relying on the existing checks in unification to do the work for you.

Whereas in the current check we do:

  1. Unify 'a -> 'a with an instantiation of 'b. 'b -> 'b (i.e. 'c -> 'c).
  2. Use the level mechanism to ensure that 'c appears in the same places as 'b.

Here we are subtly relying on the level mechanism to check something rather than doing so explicitly.

@garrigue
Contributor

@gasche Since I'm not sure that @lpw25 answered your concerns, here is my assessment.

The main goal of this PR is only to relax a bit the way we check polymorphism annotations in recursive bindings. Rather than the local approach I did before (which you accurately described), it checks the constraint after generalising all the bindings.
I had of course the same concern as you at the beginning.
However, if you look at what is done, there is nothing fancy here.
To convince you of the soundness, just assume that we add the inferred types as polymorphic annotations to the remaining bindings. Then the old method would succeed too.

The extra check required shows that some care is needed about "reverse capture", but there is nothing wrong in the principle.

The only remaining problem is that, since the way recursive bindings are checked becomes different from the other uses of universal types, the code in other parts has to be adapted in a not very natural way. It would be nice if it were more uniform.

@garrigue
Contributor

I'll discuss this PR with @gasche on Friday.

Member

@gasche left a comment

I looked at polyfy with Jacques explaining the code to me, so just a minor code-reading comment.

let rec subst_univar scope vars ty =
(**** Instantiate a generic type into a poly type ***)

let polyfy env ty vars =
Member

Sorry to bother a native speaker, but why is this not polify?

Contributor Author

Well, the name isn't a real word so the spelling is open to debate. The function makes a type into a Tpoly type, hence it "poly"-fies it. The name is also a pun on unify: poly- is the Greek prefix for many and uni- the Latin prefix for one.

polify is probably more consistent with actual "-fy" words, but I think polyfy makes the meaning clearer.

Member

I find it confusing to see a name that I can't really pronounce (well in that case, I can pronounce it but it sounds like a spelling mistake).

Contributor

@garrigue left a comment

Found while discussing the idea and code with @gasche.

typing/ctype.ml Outdated
ty, complete
)

(* Find all vars in a type *)
Contributor

Already defined: use free_variables (without ~env parameter)

Contributor Author

Done

@@ -3944,22 +3937,34 @@ and type_label_exp create env loc ty_expected
let arg = type_argument env sarg ty_arg (instance ty_arg) in
end_def ();
try
Contributor

You're adding a lot of extra code in this case. Couldn't it be factorized?

Contributor Author

I've now factorized out the various generalize calls.

@gasche
Member

gasche commented Nov 29, 2019

I had a discussion about this PR with Jacques, whose point was mainly to understand better Leo's description. I arrived at a description that I understand, which I'm going to copy below. From this description (assuming it indeed coincides with what Leo has implemented), I am now convinced that there are no doubts about soundness with this approach.

The description below presents a series of increasingly complex algorithms to check a declaration of the form let f : foo = bar where foo is a polymorphic scheme (the fact that recursive bindings have several mutual bindings is elided from this description, which focuses on the core of the levels/generalization/whatever machinery):

  1. First a very naïve approach that I came up when wondering how this is done, then
  2. a more "delayed" approach that is currently used in the type-checker (including to check let-rec bindings), then
  3. a more heavy-weight approach corresponding to (a fictional simplification of) module-signature checks, and finally
  4. Leo's approach that takes elements from both let f : foo = bar (the ascription is known in advance and used to generate an expected type) and from the module-signature check (respect of the annotation is checked after generalization, rather than before).

(Apologies for the horrible markup, these come from raw-text notes.)

Simple let:

  code:
    let f = t

  algorithm:
    level++
    typecheck t into τ
    level--
    generalize τ
  

Let with polymorphic signature, naïve approach

  code:
    let f : σ = t

  algorithm:
    level++
      construct σ
      level++
      typecheck t at expected type (rigid instance of σ) into τ'
      level--
      (if τ' survived level-- we know the polymorphism of σ is present,
       otherwise either typechecking or level-- would have failed;
       we now throw away τ' which contains rigid variables from the check)
      take τ a new instance of σ
    level--
    generalize τ

  rigid instance: instance with "rigid variables" (local type constructors)
    at the current level

  Problem with this approach: polymorphism errors show up as scoping
  errors in the middle of type-checking t, which is not very nice for
  error messages (it's better to talk about the type as a whole when
  discussing polymorphism issues).



Let with polymorphic signature, delayed approach

  code:
    let f : σ = t

  algorithm:
    level++
      construct σ
      level++
      typecheck t at expected type (fixed instance of σ) into τ'
      level--
      (at this point we may still have unified fresh variables in ways that 
       are incompatible with rigid polymorphism, so we must still check this)
      verify quantifiers
        (fresh variables coming from the scheme must be
         variables, generalizable and distinct)
      take τ a new instance of σ
    level--
    generalize τ

   fixed instance: instance with fresh unification variables at the current level,
     but rows get fixed in the process (optimisation: this prevents instantiating
     rows; if we did not do this, we would have to check that rows
     were not instantiated after the fact, and this would be more expensive).

   
Let with module signature, simplified description (for comparison)

  code:
    (struct
       let f = t
     end : sig
       val f : σ
     end)

  algorithm:
    struct part:
      level++
      typecheck t into τ
      level--
      generalize τ into σₜ
    sig part:
      check (σₜ < σ)

Let with polymorphic signature, Leo style

  code:
    let f : σ = t

  algorithm:
    level++
    construct σ
    typecheck t at expected type (fixed instance of σ) into τ'
    level--
    generalize τ' + polify + copy into σₜ
    (at this point parts of σ which were involved in typechecking t
     may have been generalized, so we must also:)
    generalize σ into σ'
    check unifiable(structure-instance(σₜ), structure-instance(σ'))

  Note:
    check unifiable(structure-instance(σₜ), structure-instance(σ'))
  is in fact equivalent to
    check (σₜ < σ)
  from the previous system. Indeed, σₜ is obtained
  using an expected type instance(σ) so we already know
  that it is less general than σ.

  Note: the module version does not need `polify` because it uses
  polymorphic signatures of the form ('a -> 'a), not ('a. 'a -> 'a):
  polymorphism is witnessed by generic-level variables, rather than
  explicitly-quantified variables (Tunivar).

I also looked at some code with Jacques, but I cannot do a proper implementation review myself.

@lpw25
Contributor Author

lpw25 commented Dec 14, 2019

Personally, I think that is more detail than should really go in that part of the manual. If someone who works on the manual (@Octachron, @gasche?) wants to try and find some reasonable wording to add to the manual then I will happily include it, but I think the PR is fine without it.

For compatibility problems I think the Changelog entry is sufficient for telling people when the semantics changed.

@gasche
Member

gasche commented Dec 14, 2019

I looked at the manual, and indeed there is not much detail there on polymorphic recursion, so it is not clear to me that anything in the existing content should be changed -- I agree with Leo's assessment.

That said, if we had a more detailed description of polymorphic recursion, I'm not sure what should be changed in it. The problem is that I don't know how to give a specification to the change that Leo proposes, without giving the implementation (which is what most of the PR discussion was about). Leo describes the implementation and gives a few examples that are improved, but what is the general class of improved examples? Is there a simpler way to explain to users which unannotated definitions are now allowed?

The "idle proposal" I made above had (a more complex implementation and) some simple and strong properties that could be described easily, without implementation details. In particular: a binding that only depends on annotated bindings will behave as if it was annotated (with respect to the typability of other bindings depending on it). Does a similar property hold with the proposed implementation?

@garrigue
Contributor

garrigue commented Dec 17, 2019

I would not say that this approach can only be described by its implementation.
However, it is true that the implementation is just doing what the specification says.
Namely (ignoring weak type variables),

E |- let rec f_1 : s_1 = e_1 and ... and f_n : s_n = e_n : {f_1 : s'_1, ..., f_n : s'_n}

is well-typed iff there exists a substitution S such that

  • dom(S) is disjoint from ftv(E)
  • E, f_1 : S(s_1), ..., f_n : S(s_n) |- e_i : t_i
  • s'_i = forall V.t_i where V = ftv(t_i) \ ftv(E)
  • s'_i < S(s_i) (i.e. S(s_i) is an instance of s'_i)

Except for the introduction of type annotations (which may contain free variables) and the last check, this is the typing rule for ML let in its syntax-directed form.

That said, I read the reference manual again, and it says nothing about types, so I agree with you two that there is no way to talk about this in the manual. This just shows that the "reference" manual says almost nothing about typability (only a tiny bit about subtyping).
So what should be done here? Add the above explanation to Changes? Provide this information somewhere else?

@garrigue
Contributor

To be a bit more precise, the difference in the previous behavior was that the instance check was local, i.e.

  • s''_i = forall V.t_i where V = ftv(t_i) \ ftv(E, f_1 : S(s_1), ..., f_n : S(s_n))
  • s''_i < S(s_i)

@lpw25
Contributor Author

lpw25 commented Dec 17, 2019

I actually think that the discussion on this PR, combined with the entry in the Changes file that references this PR is sufficient for documenting the change.

Ideally there would be more detailed descriptions of language features somewhere in the manual, but currently we don't have such descriptions and there is no reason to add something special just for this feature.

@garrigue
Contributor

While what you say correctly describes the current state of OCaml development, I find it strange to have to look in the middle of a rather long discussion to find the specification of a feature change. We need to start collecting pieces of the specification in a reliable place.

@garrigue
Contributor

Just to confirm that this PR is approved. My documentation suggestions are in no way requirements, and are more about the future of the language.

@lpw25
Contributor Author

lpw25 commented Jan 20, 2020

I've rebased and fixed the check-typo failure. Ready to merge once the CI is happy.

@garrigue
Contributor

@Octachron found a problem that seems related to this PR, since it only occurs in 4.11 (and trunk):

type 'p pair = 'a * 'b constraint 'p = < left:'a; right:'b>
let fine:  <left:'left; right:'right> pair -> <left:'right; right:'left> pair =
  fun (x,y) -> y,x
let error: 'left 'right. <left:'left; right:'right> pair -> <left:'right; right:'left> pair =
  fun (x,y) -> (y,x)

The explicitly polymorphic annotation causes an error, where it was fine in 4.10.

@lpw25
Contributor Author

lpw25 commented May 19, 2020

This seems to be a pre-existing bug in unification of polymorphic types which is revealed by this PR:

# type 'p pair = 'a * 'b constraint 'p = < left:'a; right:'b>;;
type 'c pair = 'a * 'b constraint 'c = < left : 'a; right : 'b >

# let foo   : < m : 'left 'right. <left:'left; right:'right> pair > -> < m : 'left 'right. <left:'left; right:'right> pair > =
  fun x -> x;;
Line 2, characters 11-12:
2 |   fun x -> x;;
               ^
Error: This expression has type
         < m : 'left 'right. < left : 'left; right : 'right > pair >
       but an expression was expected of type
         < m : 'left 'right. < left : 'left; right : 'right > pair >
       Type < left : 'left; right : 'right > pair = 'a * 'b
       is not compatible with type < left : 'left0; right : 'right0 > pair 
       The method left has type 'a, but the expected method type was 'left
       The universal variable 'left would escape its scope

@lpw25
Contributor Author

lpw25 commented May 26, 2020

@Octachron Could you make an issue for the bug you found so that we don't lose track of it?

@Octachron
Member

Done with some more context in #9603 .

@xavierleroy
Contributor

This looks unsound because of polymorphic references, see #9856. Shall we revert and make an emergency 4.11.1 release?
(Cc: @Octachron.)

@lpw25
Contributor Author

lpw25 commented Aug 25, 2020

#9857 fixes the issue. We should make a 4.11.1 release with that in it.

@Octachron
Member

I agree that this requires an early 4.11.1 release, but it is probably better not to shortcut our review process.
