Relax the handling of explicit polymorphic types#1132
Conversation
|
@garrigue Would you be able to look at this? |
|
@garrigue Ping |
|
I'm not yet convinced that this makes the language more usable. After I.e., the situation is improved for explicit universal types at toplevel, but not their combination with local abstract types. |
|
This patch does indeed only fix the situation for universal types and not for local abstract types. It is not immediately clear to me if the same thing can be done for local abstract types as it is a slightly different sort of problem. Possibly it would be safe to just introduce the type constructor at the outer level. However, I think this patch still makes the language noticeably more usable. The error you get for local abstract types is clearer than the one for universal variables because it is a simpler kind of error: there is an obvious type binding at a particular level and that binding is escaping its scope. These scope escape errors are familiar to anyone working with local abstract types. Whereas the universal variable error is more subtle: there is no obvious type binding happening -- the |
|
I'm wondering whether it wouldn't be more natural to change the semantics of to have the |
I'm unsure about this one. I find that beginners are often confused about the difference between universal variables and unification variables. I worry this change would make that confusion worse. Either way it doesn't improve the situation for: let f : 'a . 'a -> 'a = fun (x : 'b) -> x if anything it makes it even more confusing. |
|
It looks like my view is diametrically opposed. I would be quite happy with doing nothing, saying that one can still use locally abstract types, but if we are to do something it should try to solve the real problem. What I suggested is to let type annotations scope type variables, the way it is done in Haskell for instance. For people who don't like the behavior of unification variables, this allows them to completely ignore them. This may look like a big change, but it is essentially compatible, and easy to understand in terms of scope. Of course this would require checking that nothing breaks. On the other hand, examples such as look just strange to me. If you use a different name, why expect it to be the same thing? Because you know the behavior of unification variables: but then you should be aware of the scoping problem too. This said, there is also the solution of going the Standard ML way: have unification variables use the highest level possible. This is is tricky to implement (requires a first pass to insert scoping information in the syntax tree), but this would solve all the problems we have with them. |
I wouldn't say diametrically opposed. I think that the scoping of unification variables and the level at which polymorphic annotations are checked are both problems that should be solved. For instance, here is an example that does not use unification variables: let rec f : 'a. 'a -> 'a = fun x -> g x
and g x = xwhich gives an error with the current compiler: Error: This definition has type 'a -> 'a which is less general than
'a0. 'a0 -> 'a0but is allowed with this patch. This seems like an obvious improvement to me, especially since the original error is a quite confusing one.
I still find the idea of letting the
This would be my preferred solution to the unification scoping issue as I think that it matches peoples expectations quite well without changing the binding rules for |
|
As another point I also think that the approach in this patch of using a form of instantiation to create the polymorphic type and using the occurs check to catch escaping universal variables is cleaner than the old approach of mutating the original type and using levels. In particular, I think it allows better error messages. For example, this code: let rec depth : 'a. 'a option -> _ =
function Some x -> x | None -> assert falsecurrently gives the error: Error: This definition has type 'a option -> 'a which is less general than
'a0. 'a0 option -> 'abut with this patch you get: Error: This expression has type 'a. 'a option -> 'a
but an expression was expected of type 'a. 'a option -> 'b
The universal variable 'a would escape its scope |
|
@garrigue Any chance we can agree on this before the 4.06 freeze? |
|
We should have discussed that at ICFP. As things stand now I stay unconvinced. |
|
Persuading @garrigue is still on my todo list. (I aim to do it at every developer's meeting but then never find the time/remember). |
|
@garrigue and I chatted about this briefly at the recent developer meeting. We agreed that switching from checking polymorphic annotations with only the current recursive binding generalized to checking polymorphic annotations when all the recursive bindings have been generalized is fine. In other words, supporting this: let rec foo : 'a. 'a -> 'a = fun x -> bar x
and bar y = y;;is fine. The details of the actual choice of implementation in this PR still need to be reviewed. |
|
I have looked at the code again. let check_univars env kind exp ty_expected vars =
begin_def ();
let vars, ty = instance_parameterized_type vars exp.exp_type;
end_def ();
(* need to expand twice? cf. Ctype.unify2 *)
(* current code *) Also, where does the need for the extra unification arise from? Still trying to understand what happens there. |
7edab63 to
d598af0
Compare
|
I've rebased the code. To aid review, I'll describe the old approach and the new approach in detail. I'll try to include enough detail so that more than just Jacques and I can follow along. Old approachGiven code like: let rec foo : 'a. 'a -> 'a = e
and ...The old approach would first enter the scope of the definition of Note that these variables are within the scope of The old approach would then type-check Finally it would check that in the result type (It would also mutate After that it continues on to check the remaining mutually recursive bindings. New approachGiven code like: let rec foo : 'a. 'a -> 'a = e
and ...The new approach takes an instance of the explicit polymorphic type -- without entering the scope of It then type-checks Finally it makes an explicitly polymorphic copy of How is the new approach more relaxed?The new approach is more relaxed than the old approach because it allows code like: let rec foo : 'a . 'a -> 'a = fun x -> bar x
and bar = fun y -> yIn the old approach, In the new approach we instead generalize Why do we need to unify with the original polymorphic type?Consider the function: let rec foo : 'a . 'a -> 'd = fun x -> xThis code should be prevented because the expression actually has type In the old approach this code is disallowed because However, the new approach checks generalizability outside the scope of This is why we also unify this polymorphic type with the original explicitly polymorphic type -- This unification ensures that -- even though we allow the polymorphic variables to escape into the types of the other mutually recursive functions in the group -- we only allow expressions whose type is as general as that specified in the annotation. |
|
As a complete newcomer to this problem, here is what I would expect
I'm not sure what it means to type-check an expression with If I understand correctly, this corresponds rather precisely to the The "new approach" mixes several different concepts together In the end, polymorphic recursion has a fairly open design space, and
From your description (which I managed to follow; thanks for being so Here would be an idle idea for a potential approach that would work on This proposal has two properties that I find nice:
|
|
I think the most damning indictment of the existing mechanism is that: module M : sig
val x : 'a -> 'a
end = struct
let x = e
endand let x : 'a. 'a -> 'a = edo not accept the same I don't think it is fair to say that the new approach mixes different concepts. It is simply a new way to check
Personally, I think the new check is less elaborate and more obviously correct. You want to check a type like:
This way you basically replace a normal polymorphic type with the corresponding explicit polymorphic type (i.e. you create Whereas in the current check we do:
Here we are subtly relying on the level mechanism to check something rather than doing so explicitly. |
|
@gasche Since I'm not sure that @lpw25 answered your concerns, here is my assessment. The main goal of this PR is only to relax a bit the way we check polymorphism annotations in recursive bindings. Rather than the local approach I did before (which you accurately described), it checks the constraint after generalising all the bindings. The extra check required shows that some care is needed about "reverse capture", but there is nothing wrong in the principle. The only problem remaining is that the way recursive bindings are checked becoming different from the other uses of universal types, the code in other parts has to be adapted in a not very natural way. It would be nice if it were more uniform. |
|
I'll discuss this PR with @gasche on Friday. |
gasche
left a comment
There was a problem hiding this comment.
I looked at polyfy with Jacques explaining the code to me, so just a minor code-reading comment.
| let rec subst_univar scope vars ty = | ||
| (**** Instantiate a generic type into a poly type ***) | ||
|
|
||
| let polyfy env ty vars = |
There was a problem hiding this comment.
Sorry to bother a native speaker, but why is this not polify?
There was a problem hiding this comment.
Well, the name isn't a real word so the spelling is open to debate. The function makes a type into a Tpoly type hence it "poly"-fies it. The name is also a pun on unify: poly- is the greek prefix for many and uni- the latin prefix for one.
polify is probably more consistent with actual "-fy" words, but I think polyfy makes the meaning clearer.
There was a problem hiding this comment.
I find it confusing to see a name that I can't really pronounce (well in that case, I can pronounce it but it sounds like a spelling mistake).
typing/ctype.ml
Outdated
| ty, complete | ||
| ) | ||
|
|
||
| (* Find all vars in a type *) |
There was a problem hiding this comment.
Already defined: use free_variables (without ~env parameter)
| @@ -3944,22 +3937,34 @@ and type_label_exp create env loc ty_expected | |||
| let arg = type_argument env sarg ty_arg (instance ty_arg) in | |||
| end_def (); | |||
| try | |||
There was a problem hiding this comment.
You're adding a lot of extra code in this case. Couldn't it be factorized?
There was a problem hiding this comment.
I've now factorized out the various generalize calls.
|
I had a discussion about this PR with Jacques, whose point was mainly to understand better Leo's description. I arrived at a description that I understand, which I'm going to copy below. From this description (assuming it indeed coincides with what Leo has implemented), I am now convinced that there are no doubts about soundness with this approach. The description below presents a series of increasingly complex algorithms to check a declaration of the form
(Apologies for the horrible markup, these come from raw-text notes.) I also looked at some code with Jacques, but I cannot do a proper implementation review myself. |
d598af0 to
95ee46c
Compare
|
Personally, I think that is more detail than should really go in that part of the manual. If someone who works on the manual (@Octachron, @gasche?) wants to try and find some reasonable wording to add to the manual then I will happily include it, but I think the PR is fine without it. For compatibility problems I think the Changelog entry is sufficient for telling people when the semantics changed. |
|
I looked at the manual, and indeed there are not much details there on polymorphic recursion, so it is not clear to me that anything in the existing content should be changed -- I agree with Leo's assessment. That said, if we had a more detailed description of polymorphic recursion, I'm not sure what should be changed in it. The problem is that I don't know how to give a specification to the change that Leo proposes, without giving the implementation (which is what most of the PR discussion was about). Leo describes the implementation and gives a few examples that are improved, but what is the general class of improved examples? Is there a simpler way to explain to users which unannotated definitions are now allowed? The "idle proposal" I made above had (a more complex implementation and) some simple and strong properties that could be described easily, without implementation details. In particular: a binding that only depends on annotated bindings will behave as if it was annotated (with respect to the typability of other bindings depending on it). Does a similar property hold with the proposed implementation? |
|
I would not say that this approach can only be described by its implementation.
is well-typed iff there exists a substitution S such that
Except for the introduction of type annotations (which may contain free variables), and the last check, this is the typing rule for ML let in its syntax directed form. This said, I read again the reference manual, and it says nothing about types, so I agree with you two that there is no way to talk about this in the manual. This just shows that the "reference" manual says almost nothing about typability (only a tiny bit about subtyping). |
|
To be a bit more precise, the difference in the previous behavior was that the instance check was local, i.e.
|
|
I actually think that the discussion on this PR, combined with the entry in the Changes file that references this PR is sufficient for documenting the change. Ideally there would be more detailed descriptions of language features somewhere in the manual, but currently we don't have such descriptions and there is no reason to add something special just for this feature. |
|
While you say correctly describes the current state of OCaml development, I find it strange to have to look in the middle of a rather long discussion to find a specification of a feature change. We need to start collecting pieces of the specification in a reliable place. |
|
Just to confirm that this PR is approved. My documentation suggestions are by no way requirements, and more about the future of the language. |
3734038 to
33d4540
Compare
|
I've rebased and fixed the check-typo failure. Ready to merge once the CI is happy. |
|
@Octachron found a problem that seems related with this PR, since it only occurs in 4.11 (and trunk): type 'p pair = 'a * 'b constraint 'p = < left:'a; right:'b>
let fine: <left:'left; right:'right> pair -> <left:'right; right:'left> pair =
fun (x,y) -> y,x
let error: 'left 'right. <left:'left; right:'right> pair -> <left:'right; right:'left> pair =
fun (x,y) -> (y,x)The explicitly polymorphic annotation causes an error, where it was fine in 4.10. |
|
This seems to be a pre-existing bug in unification of polymorphic types which is revealed by this PR: # type 'p pair = 'a * 'b constraint 'p = < left:'a; right:'b>;;
type 'c pair = 'a * 'b constraint 'c = < left : 'a; right : 'b >
# let foo : < m : 'left 'right. <left:'left; right:'right> pair > -> < m : 'left 'right. <left:'left; right:'right> pair > =
fun x -> x;;
Line 2, characters 11-12:
2 | fun x -> x;;
^
Error: This expression has type
< m : 'left 'right. < left : 'left; right : 'right > pair >
but an expression was expected of type
< m : 'left 'right. < left : 'left; right : 'right > pair >
Type < left : 'left; right : 'right > pair = 'a * 'b
is not compatible with type < left : 'left0; right : 'right0 > pair
The method left has type 'a, but the expected method type was 'left
The universal variable 'left would escape its scope |
|
@Octachron Could you make an issue for the bug you found so that we don't loose track of it? |
|
Done with some more context in #9603 . |
|
This looks unsound because of polymorphic references, see #9856. Shall we revert and make an emergency 4.11.1 release? |
|
#9857 fixes the issue. We should make a 4.11.1 release with that in it. |
|
I agree that this requires an early 4.11.1 release, but it is probably better to not shortcut our review process. |
This is an up-to-date version of my patch from PR#6673.
It changes the approach for checking polymorphic types. Instead of generalising the unification variables and then mutating them to be universal variables, it generalises the whole type and then instantiates it with univeral variables for the generalised unification variables. This instance is then unified with the intended polymorphic type -- which allows the usual unification checks to ensure that universal variables do not escape.