Introduce wrapper functions for level management#11536
Conversation
|
Very nice! Those functions always bothered me. |
|
While refactoring, could we also rename (I suspect there was a time when |
|
It's perhaps worth discussing here a slightly larger change that I think would be a huge improvement: store the current level as part of the environment. Then instead of |
|
@stedolan @lpw25 |
|
I thought about
I don't find this confusing. In |
Rather than confusing I would describe this as just semantically correct. If you write the declarative typing rules out and don't unaccountably forget to include your type variables in your typing environment then it is clear that the definition and body of a
This is an interesting question. You basically go from something like this, in the current PR: let ty =
wrap_def_iter ~post: generalize (fun () ->
...
)
into something like: let ty
let env = Env.add_level env in
...
in
generalize env tyalthough you could also do something like: let ty =
wrap_new_def env ~post:generalize (fun env ->
...
)
inwith the env-based version too. Of these I personally prefer the env-based versions because they have a value which explicitly represents scopes rather than using global state. I don't want to side-track this PR too much, I just wanted to mention this possibility in case you were keen to try it out now. |
|
Personally I think that tracking the level in the environment has a lot of merit, but I think we could do these sort of refactorings incrementally, leaving this suggestion to a later PR. We had our type-system-refactoring meeting in early November, we bikeshed over the API a bit during the meeting, but our consensus conclusion was that we should limit bike-shedding (I only asked to unify the |
|
@lpw25 So I would like to think carefully about that. |
|
Actually, there may be a way to use the environment meaningfully. type 'a Env.t (* add a phantom parameter *)
type new_env = Env : 'a Env.t -> new_env
type 'a type_scheme
val raise_level: 'a Env.t -> new_env
val generalize: 'a Env.t -> type_expr -> 'a type_scheme
val add_value: Ident.t -> 'a type_scheme -> 'a Env.t -> 'a Env.t
let ty =
let Env env = raise_level env in ...
in
let ts = generalize env ty in
let env = add_valud id ts env in
...This way we really get added safety. |
|
Just adding to this hypothetical design (not sure this is the right place...)
|
|
|
|
I have just finished renaming according to the last meeting. |
and for type variable scoping ([Typetexp.wrap_type_variable_scope]). The older API ([Ctype.(begin_def,end_def)], [Typetexp.(narrow,widen)], etc.) is now removed. The scope of some level management is refined (in particular for [type_application]).
| begin_def (); | ||
| let (args, ty_res) = type_application env funct sargs in | ||
| end_def (); | ||
| unify_var env (newvar()) funct.exp_type; |
There was a problem hiding this comment.
What is going on here? It looks like the behavior is changed.
This seems related to #11519 but the change is different from the one in #11519.
Unless you are very confident that the change is correct, I think we could stick to "just a refactoring" (no change in behavior) in this current PR to make it easier to review, and keep small behavior changes for smaller independent PRs.
There was a problem hiding this comment.
I had to scratch my head for a while to understand why this extra level was needed. It is actually there to warn on partial applications, so I moved the with_local_level to the end of type_application, where it is needed.
Note that type_application is a long sequence of functions, called from the wrapped code, so there is no semantical change.
I also removed the call to unify_var as there is no need to lower the levels of funct.exp_type, since it does not appear in the return type. I don't know why I included this line in 2004, probably did not think much about it.
|
@gasche Thanks for the detailed reviews with many good suggestions. I used most of them, and tried to explain why when I chose a different way. |
Yes, (of course) feel free to disagree with some of the suggestions and explain so. Thanks! I will try to continue my review -- but probably not much in the next few hours. |
| generalize_structure ty; | ||
| let cty, ty, force = | ||
| with_local_level ~post:(fun (_,ty,_) -> generalize_structure ty) | ||
| (fun () -> Typetexp.transl_simple_type_delayed !env sty) |
There was a problem hiding this comment.
I have a naive question about the usage of Typetexp.transl_simple_type_delayed in typecore.
Many of its occurrences are wrapped in a new level, with generalize_structure called on the ty comment (second element of the output triple). (The only two exceptions are the calls in solve_Ppat_poly_constraint and in the Pexp_coerce case.) But looking at the code of transl_simple_type_delayed, the middle component is the result of calling instance on cty.ctyp_type. What is the point of taking an instance and then re-generalizing right after ?
(Should this common behavior be wrapped in a new function of Typetexp, to simplify the logic in Typecore?)
There was a problem hiding this comment.
Good question. I don't have a full answer yet, but a first remark is that the generalization function used inside transl_simple_type_delayed is not the same as the one used outside. The internal one generalizes everything, but the external one doesn't generalize type variables, so it would be incorrect to just use the type inside ctyp. I assume that the reason to generalize inside transl_simple_type_delayed is to keep a generalized version of the type inside ctyp, but I have not yet found where it is used. The goal might just be to be able to print the annotation independently of types inferred from outside.
| ~post: begin fun (exp,vars) -> | ||
| generalize_and_check_univars env "method" exp ty_expected vars | ||
| end | ||
| in |
There was a problem hiding this comment.
I believe this is correct, but I think that the code would be easier to read with a higher-order post function as I suggested during one of our earlier review discussions.
let exp =
with_local_level @@ fun post ->
let vars, ty'' =
with_local_level_if_principal @@ fun post ->
let vars, ty'' = instance_poly true tl ty' in
post generalize_structure ty'';
vars, ty''
in
let exp = type_expect env sbody (mk_expected ty'') in
post (generalize_and_check_univars env "method" exp ty_expected) vars;
expYou were unconvinced at the time, and I think that it is more useful to make progress on the present PR without trying to change the API too much, but I still think this would be a nicer approach in the complex cases, and probably fine in the smiple cases. Maybe we can revisit this option later.
There was a problem hiding this comment.
Currently we are rather thinking of making things more first-order, and exploit typing to enforce invariants. But we have to see what flies.
| let ety = Subst.type_expr Subst.identity body.exp_type in | ||
| replace ety; | ||
| (body, ety) | ||
| end |
There was a problem hiding this comment.
I would find more natural to return { body with exp_type = ety } here, given that the previous body.exp_type is invalid outside the local level/scope.
There was a problem hiding this comment.
I tried the suggested change, but get a different output in typing-poly.
Still wondering what could be the cause; i.e. how can rebuilding a record twice rather than once can be observed.
| end_def (); | ||
| lower_contravariant env arg.exp_type; | ||
| begin_def (); | ||
| let arg = {arg with exp_type = instance arg.exp_type} in |
There was a problem hiding this comment.
Was this instance call useful/necessary? We just type-checked arg at the same level (since the call to type_exp we called end_def followed by begin_def again), and we did not generalize in between, only lowered contravariant variables.
There was a problem hiding this comment.
This indeed does not seem to be necessary.
This may be a leftover from a previous approach to lower_contravariant that simultaneously generalized the structure of the type.
An alternative (tested) version of the code fixing this would be:
let arg, _ =
with_local_level (fun () -> type_exp env sarg, instance ty_arg)
~post: begin fun (arg, ty_arg) ->
lower_contravariant env arg.exp_type;
unify_exp env arg ty_arg;
generalize_and_check_univars env "field value" arg label.lbl_arg vars
end;
in
{arg with exp_type = instance arg.exp_type}This is orthogonal to the current PR, but do you think it should be included?
There was a problem hiding this comment.
I prefer the code you show above to the one in the PR, and I agree that it should be equivalent. If you want to include it in the PR, please feel free; if you want to submit it as a later PR, that is also okay with me. But I think that we want the change in trunk in the medium term.
| @@ -4417,15 +4358,19 @@ and type_label_exp create env loc ty_expected | |||
| with exn when maybe_expansive arg -> try | |||
There was a problem hiding this comment.
The logic here is very tricky, and I'm not fully satisfied with the proposal in the PR:
- it leaves some logic out of the wrapper that clearly belongs to a
post, and - it performs the snapshotting in a way that does not respect the wrapper nesting.
Sketch of alternative proposal (untested):
and type_label_exp create env loc ty_expected
(lid, label, sarg) =
(* Here also ty_expected may be at generic_level *)
let separate = !Clflags.principal || Env.has_local_constraints env in
let (vars, ty_arg) =
with_local_level begin fun () ->
let (vars, ty_arg) =
with_local_level_iter_if separate begin fun () ->
let (vars, ty_arg, ty_res) =
with_local_level_iter_if separate ~post:generalize_structure
begin fun () ->
let ((_, ty_arg, ty_res) as r) = instance_label true label in
(r, [ty_arg; ty_res])
end
in
begin try
unify env (instance ty_res) (instance ty_expected)
with Unify err ->
raise (Error(lid.loc, env, Label_mismatch(lid.txt, err)))
end;
(* Instantiate so that we can generalize internal nodes *)
let ty_arg = instance ty_arg in
((vars, ty_arg), [ty_arg])
end
~post:generalize_structure
in
if label.lbl_private = Private then
if create then
raise (Error(loc, env, Private_type ty_expected))
else
raise (Error(lid.loc, env, Private_label(lid.txt, ty_expected)));
let arg = if vars = [] then Either.Left
(vars, ty_arg)
end
in
let arg =
let exp_instance arg =
{ arg with exp_type = instance arg.exp_type } in
let exception Retry of exn in
try
let snap = if vars = [] then None else Some (Btype.snapshot ()) in
exp_instance @@ with_local_level
begin fun () ->
type_argument env sarg ty_arg (instance ty_arg)
end
~post:begin fun arg ->
if vars <> [] then try
if maybe_expansive arg then
lower_contravariant env arg.exp_type;
generalize_and_check_univars env "field value" arg label.lbl_arg vars;
with exn when maybe_expansive arg ->
(* Try to retype without propagating ty_arg, cf PR#4862 *)
Option.iter Btype.backtrack snap;
raise (Retry exn)
end
with Retry exn -> try
let arg = with_local_level (fun () -> type_exp env sarg)
~post:(fun arg -> lower_contravariant env arg.exp_type)
in
exp_instance @@ with_local_level
begin fun () ->
let arg = exp_instance arg in
unify_exp env arg (instance ty_arg);
arg
end
~post: begin fun arg ->
generalize_and_check_univars env "field value" arg label.lbl_arg vars
end
with Error (_, _, Less_general _) as e -> raise e
| _ -> raise exn (* In case of failure return the first error *)
in
(lid, label, arg)There was a problem hiding this comment.
I'm not too excited by this change just now: while I agree that this makes sense, the code is much changed, and the need to introduce a local exception to be sure that we are not catching an exception coming from type_argument is hairy.
I don't think we really gain in complexity as a result.
It seems that you are trying to fix smells in the original code rather than in this PR, which tries to be minimal.
I agree that the structure of the code is not so nice, but can can we also differ this one for later ?
To answer your points:
- what belongs to
postis not always easy to determine; this is is a first attempt - since snapshotting is only about unification, it is not directly related to levels (but I agree that your proposal looks better ultimately)
There was a problem hiding this comment.
See #11900 for a proposal to merge this change now that the current PR is merged.
gasche
left a comment
There was a problem hiding this comment.
One downside of this PR is that the higher-order wrappers do not play nicely with let-binding scopes. In many cases the previous code looked like
begin_def ();
let x = ... in
let y = ... in
let z = .... in
end_def ();
... (* uses x, y, z *)and the new code now has to look like the more convoluted
let x, y, z =
with_local_level @@ fun () ->
let x = ... in
let y = ... in
let z = ... in
x, y, z
in
...Notice the two repetitions of x, y, z that redundantly expose the environment defined inside the wrapper to the outside.
When several wrappers are nested, this can result in boilerplate-y code. For example:
let (vars, ty_arg, snap, arg) =
with_local_level begin fun () ->
let (vars, ty_arg) =
with_local_level_iter_if separate begin fun () ->
let (vars, ty_arg, ty_res) =
with_local_level_iter_if separate ~post:generalize_structure
begin fun () ->
....i am not sure how to do better, and I do think that the benefits of the PR outweigh the cost in readability that it incurs.
If we had some type inference in the module language, we could write something like this:
(* declaration *)
module With_local_level {module type S} (F : () -> S) () : S = ..
(* user code *)
let open With_local_module(functor () -> struct
let x = ...
let y = ...
)() inbut this proposal is made impractical by the fact that the signature S cannot be inferred, it has to be passed explicitly, introducing a lot of redundancy again.
This could also be attacked at the term level, if we had lightweight extensible records with an open construct, but we currently do not have those.
|
Concerning the interaction between let-binding and scope, I'm not sure it is a bad thing to have to make things more explicit. |
|
Being explicit about the things we take out of the level/scope is good, but right now this involves a lot of redundancy. The "telescope" (the environment exported from within the level to the outside) is duplicated in 2-3 places:
The lists in (1) and (3) sometimes omit some of the bindings, but they still duplicate the arity and this makes changing the number or ordering of exported values tedious. Most of the solutions I considered to avoid this redundancy (none of which work in current OCaml) support being explicit, but without the redundancy. For example we could consider writing: let type r = {x: _; y: _; z: _} in
let t =
with_local_level @@ fun () ->
let x = ... in
let y = ... in
let z = ... in
{x; y; z}
in
let open t in
...(If This assumes that we allow local type definitions to leave some of their type components inferred as weak inference variables, which is definitely not supported today. Having to provide explicit type information here would be too painful. With this approach, we reduce the duplication slightly, as we repeat the telescope at most twice (once in the type declaration, once in the return value; never in the let t =
with_local_level @@ fun () ->
let x = ... in
let y = ... in
let z = ... in
{< x; y; z >}
in
let open t in
...We still have an explicit list of all exported values, but only in one place. |
|
I don't think those features would really make the code easier to read. They make the binding structure of the code more complicated, and so it requires more thinking to understand. If you use records and punning: let { x; y; z } =
with_local_level @@ fun () ->
let x = ... in
let y = ... in
let z = ... in
{ x; y; z }
inthen there is no genuine repetition here in the sense that a machine is checking that you wrote the same thing in both places so you know there is no need to look deeply at both of them. |
|
The problem here is that you have to declare the record type in advance, and in particular give the type of the fields, and this is so syntactically heavy that no one is doing it. |
|
Fair enough, but then I think the language extension you are looking for is just structural record types that support punning. There is no need for some kind of |
| then widen so as to not introduce any new constraints *) | ||
| let z = narrow () in | ||
| (* narrow and widen are now invoked through wrap_type_variable_scope *) | ||
| with_local_type_variable_scope begin fun () -> |
There was a problem hiding this comment.
@goldfirere: you mentioned that you looked at Typetexp.{narrow,widen}; it might be the change to a wrapper combinator proposed in this PR (exemplified in the diff here) conflicts with the refactoring you had in mind. (In that case, I would say submit anyway, we can worry about rebasing once something gets merged.)
gasche
left a comment
There was a problem hiding this comment.
I'm actually done going through this very large PR. I left a few minor comments again, but I definitely haven't found any apparent correctness issue. (And I agree that the code is generally nicer, with my reservation about the telescope redundancy.)
I would like to give @garrigue the opportunity to respond to the last batch of remarks/suggestions/questions, but then I think this should be good to merge.
There was a problem hiding this comment.
Thanks @t6s and @garrigue for this work. It's a very invasive change, but I think that overall it is a nice improvement. I have reviewed the PR and we have discussed every part of the change in details -- if there are any errors left, at least they escaped serious scrutiny. I am approving the PR and I think we should merge.
Comments:
- I still believe that the wrapper style brought by this PR generates hard-to-read / boilerplate-y code when levels get nested locally, due to "telescope redundancy" ( #11536 (review) ). I don't see any solution to this problem with the current features of OCaml -- anonymous/structural records would be nice to have here.
- The current
poststyle is an improvement over what we had before, but it is frequently inflexible and irritating. I mentioned using higher-order functions to "delay" calls placed inside the wrapper instead in #11536 (comment) , and @garrigue mentioned different ideas using typing to keep track of levels. I think this is an area for improvement. I'm not planning to work on it mysel, and I don't know if this should take priority over other work in the COCTI project (there sure is a lot to do in the type-checker codebase), but I am curious to see if something can be done. - I made a few proposals for code changes during my review that @garrigue preferred to defer to later. I think we should consider revisting them soon -- after this is merged. To help with that, I marked all minor comments "resolved"; the comments that remain visible directly in the discussion are those that I think remain relevant now -- including these code-change suggestions.
Thanks again. This was really a lot of work.
This PR introduces wrapper functions for level management (
Ctype.wrap_def, etc)and for type variable scoping (
Typetexp.wrap_type_variable_scope).The older API (
Ctype.{begin_def,end_def},Typetexp.{narrow,widen}, etc.) is removed.The goal is to clarify the structure of generalization inside the type inference algorithm
so as to be able to introduce more abstraction later.
type_application).Pexp_sendis separated intotype_send.type_unpacksis changed to use one recursion rather than twoList.fold_lefts.