Allow to name closed types in constructor type annotations by garrigue · Pull Request #12507 · ocaml/ocaml

garrigue · 2023-08-28T06:22:24Z

In #11891 it was pointed that, for constructor type annotations, the current approach of only allowing to name existentials that do not appear in the return type was too restrictive, as existentials could also appear through unification of the return type.

This PR relaxes this restriction, allowing to name any closed type appearing in the type of the arguments.
This seems to answer the request in #11891, however this means that we are now using local type definitions for two different things: existentials introduced by constructors, and aliases written in the annotation.
Note that while it could be possible to restrict those aliases to existentials, I'm not sure this would be a stable property: according to the order of unification, an existential could have been already refined. So I would prefer to stick to this relaxed definition. Note that there is still a scope restriction, so that one can only alias already known closed types.

Octachron · 2023-09-01T09:35:59Z

typing/typecore.ml

  if ids <> [] then ignore begin
+    let get_decl_manifest id env =
+      let decl =
+        try Env.find_type (Path.Pident id) env with Not_found -> assert false in


Is there a reason that the code is re-querying the declaration and manifest of the ids types from the environment? Would it not be more straightforward to store those in the ids list?

This avoids thinking hard about the code, but indeed it is ok to keep the values along, so I changed the code.

lpw25 · 2023-09-04T11:52:34Z

I think that the behaviour in this PR is surprising and confusing. I would expect any binding of the form (type a) to introduce an abstract type, but here it is introducing an alias. I'm fine with the abstract type being equal to types other than existentials outside of its scope, but within its scope it should be abstract.

gasche · 2023-09-20T13:27:38Z

We discussed this in today's type-system meeting (with @garrigue, @goldfirere, @Octachron and @t6s). I would like to find something that let us name more types than today, because un-nameable types are unergonomic, but @lpw25 does not want to give names to types that can already be named (or at least some of them). So we tried to agree on a specification that makes me happy and, we hope, would also please Leo.

The proposal is as follows: allow to give a name to local abstract types that were introduced in the current pattern-matching clause.

Example:

type _ ty =
  | Int : int ty
  | Pair : 'b ty * 'c ty -> ('b * 'c) ty


(* Gabriel wants this accepted. The criterion accepts it. *)
let example1 : type a . a ty -> unit = function
| Int -> ()
| Pair (type b c) (x, y : b ty * c ty) -> ()

(* The criterion obviously rejects this, which requires the equation (b = int) *)
let example2 : (int * int) ty -> int = function
| Pair (type b c) (x, y : b ty * c ty) -> (5 : b)

(* The criterion also rejects this,
   even though it could be accepted if b, c were forced to remain abstract *)
let example2 : (int * int) ty -> unit = function
| Pair (type b c) (x, y : b ty * c ty) -> ()

(* The criterion also rejects this:
    a1, a2 are defined "earlier" than the pattern matching clause. *)
let example2 : type a1 a2 . (a1 * a2) ty -> unit = function
| Pair (type b c) (x, y : b ty * c ty) -> ()

@garrigue thinks that this restriction would be easy to implement, and we hope that it would make @lpw25 happy. I would be happy with it -- I think we could also justify accepting more, but I mostly care about the types that cannot be given a name otherwise.

garrigue · 2023-09-21T03:09:29Z

I have implemented the restriction that we discussed yesterday.
Now only types that were freshly introduced can be bound, and only once.
Does it assuage @lpw25 's reservations ?

garrigue · 2023-09-21T03:27:55Z

Note that what I implemented is actually stricter than what @gasche suggested.
That is, one can only bind the existentials that were generated by the unification of the constructor they annotate.
So there is only one place the binding can be done.
Buiding a counter-example is a bit tricky though.

garrigue · 2023-09-21T04:50:45Z

I added an example showing the difference:

type 'a wrap = Wrap of 'a
type _ ty = Int : int ty | Pair : ('b ty * 'c ty) wrap -> ('b * 'c) ty
(* ok *)
let rec default : type a. a ty -> a = function
  | Int -> 0
  | Pair (type b c) (Wrap (b, c) : (b ty * c ty) wrap) ->
      (default b : b), (default c : c)
(* ko *)
let rec default : type a. a ty -> a = function
  | Int -> 0
  | Pair (Wrap (type b c) (b, c : b ty * c ty)) ->
      (default b : b), (default c : c)

The first default is fine because we bind b and c just when they are introduced, but the second one fails as it attempts to bind b and c to pre-existing existentials (introduced by Pair).

gasche · 2023-09-21T06:23:04Z

Yes, it makes a lot of sense to restrict to binding existentials introduced by the constructor. This is not mentioned in my restriction above because I wasn't sure it would be easy to implement / how to formulate the restriction.

lpw25 · 2023-09-21T06:27:40Z

Does it assuage @lpw25 's reservations ?

Yes, I'm happy with the new behaviour.

gasche · 2023-09-21T06:51:27Z

Great, so all that is left to do is a code review :-)

Minor note: the testsuite is failing with the finaliser_handover test.

2023-09-21T06:32:10.0730084Z  ... testing 'finaliser_handover.ml' with 1 (native) Process 60125 got signal 6(Aborted), core dumped

This test is known to be flaky (#12345) and the result can be ignored in the context of this PR.

garrigue · 2023-09-22T00:01:08Z

Note that there is still one situation that is not handled after this PR, namely when the newly introduced existential does not appear in the arguments.

type _ ty = Int : int ty | Pair : ('a * 'a) ty
let example : type a. a ty -> a -> unit = function
  | Int -> succ
  | Pair -> (fun (x,y) -> (y,x))

There is no syntax to bind the variable in Pair currently.
But this is orthogonal to this PR.

goldfirere · 2023-09-22T13:56:50Z

For the case an existential variable does not appear in the arguments, GHC used to be able to do something that might translate to

type _ ty = Int : int ty | Pair : ('a * 'a) ty
let example : type a. a ty -> a -> unit = function
  | Int -> succ
  | Pair (type a) as p -> ignore (p : (a * a) ty); (fun (x, y) : a * a -> (y,x))

In Haskell, it was worse, requiring a second match against p.

Would the above syntax work here?

garrigue · 2023-09-25T00:09:46Z

With the approach in the current PR it wouldn't work, because it requires all named variables to be bound inside the pattern itself. Moreover it requires a syntactic annotation on the argument, but there is no argument here...
I was thinking of either:

| Pair (type a. (a * a) ty) -> ...

but this is really new syntax, or:

| (Pair (type a) : (a * a) ty) -> ...

but this requires a new interpretation of type annotations in that case.
By the way, one can already use beta-expansion, but this is what we are trying to avoid.

let example : type a. a ty -> a -> a = function
  | Int -> succ
  | Pair as p -> (fun (type a) (Pair : (a * a) ty) (x, y) -> (y,x)) p

gasche

I read the code as a non-expert. The change is relatively short/small but I don't understand what is going on enough to vouch for its correctness.

gasche · 2023-09-26T13:37:02Z

testsuite/tests/typing-gadts/name_existentials.ml

-       but a pattern was expected which matches values of type "a ty * a"
-       Type "b" is not compatible with type "a"
+Error: This type annotation attempts to bind "b"
+       to the already bound existential "a".


This example is still rejected, so I think the name should stay ko3 and not ok3.

gasche · 2023-09-26T13:38:43Z

testsuite/tests/typing-gadts/name_existentials.ml

                          ^
-Error: This type does not bind all existentials in the constructor: "type b. a"
+Error: This type annotation attempts to bind "b"
+       to the non-locally-abstract type "'a".


This error is confusing -- 'a does not occur in the user code. Could it be improved? Does the 'a come from the fact that b is not actually used in the pattern argument annotation? Maybe we can get a better message in this case.

I'm not sure how to track that b is not used other than by @alainfrisch 's logic for warnings, which I don't understand fully. Adding an extra level could work too, but this introduces new logic. Let me think about it.

I think you could actually check whether b is used in the pattern argument annotations without too much difficulty. For example, when you populate the initial environment with the name_list to get the ids, you could remember the identity of the flexible dummy variables for each id. Then on line 821 in the current PR, at the place where you raise the Bind_non_locally_abstract error, you can check whether the current identity of the type is still that flexible dummy variable, or if it was unified to something else. If it was not unified (it is still the dummy), you could raise a different error: the problem is not that the name is not being bound to a non-locally-abstract type, it is not being bound at all.

(I guess that this situation could occur in cases where the name does occur syntactically in the annotation, but those positions vanish during expansion. I don't think that this is a problem.)

gasche · 2023-09-26T13:40:43Z

testsuite/tests/typing-gadts/name_existentials.ml

+Error: This pattern matches values of type "b * (b -> 'a)"
+       but a pattern was expected which matches values of type "b * (b -> 'b)"
+       This instance of "'a" is ambiguous:
+       it would escape the scope of its equation


It would be nice to have a better error message in the escape situation -- which is bound to be a fairly common error among users, giving a local name to a type coming from outside.

I removed the scoping constraint on the type variable, as it is not necessary (it will be enforced later anyway). This way, we get the same error as below. The wording might be better though.

gasche · 2023-09-26T13:41:31Z

testsuite/tests/typing-gadts/name_existentials.ml

+[%%expect{|
+type _ ty = Int : int ty | Pair : 'b ty * 'c ty -> ('b * 'c) ty
+val example : 'a ty -> 'a = <fun>
+val example : 'a ty -> 'a = <fun>


These examples are nicer, simpler than the Thunk case above (which suffers from this corner case that the instance of the constructor is exactly the type parameter, rejecting more usage situations). Could it go above in the test file?

gasche · 2023-09-26T13:49:58Z

typing/typecore.ml

  in
  tps.tps_pattern_force <- force :: tps.tps_pattern_force;
+  (* Only unify the return type after generating the ids *)
+  unify_res ();


I suspect that the "scope escape" type error is coming from here, and that it would be nice to catch it here to send the more specialized Bind_non_locally_abstract error instead.

I removed the scope error that was occurring in the test suite.

gasche · 2023-09-26T14:03:07Z

typing/typecore.ml

+      let reason = match reason with
+      | Bind_already_bound -> "already bound existential"
+      | Bind_not_in_scope -> "previously defined existential"
+      | Bind_non_locally_abstract -> "non-locally-abstract type"


I think that this error message could be more user-friendly. This corresponds to the common case where the user tries to bind an "old" type. "non-locally-abstract" is a correct description but it will be confusing for users.

For example I wouldn't mind an error message saying something like:

The local name ... can only be given to an existential variable bound to this GADT constructor. The type annotation tries to bind it to the name ... defined outside this constructor.

Done as suggested

gasche · 2023-09-26T14:03:17Z

typing/typecore.ml

+      in
+      Location.errorf ~loc
+        "@[<hov0>This type annotation attempts to bind@ \"%s\"@ %s%s@ \"%a\".@]"
+        (Ident.name id) "to the " reason Printtyp.type_expr ty


@Octachron changed the way error messages are supposed to refer to inline code (including inline types), see the occurrences of Style.inline_code (Ident.name id) and (Style.as_inline_code Printtype.type_expr) ty in the rest of the file. You should match that new style instead of using manual quoting here.

gasche · 2023-09-26T14:14:40Z

typing/typecore.ml

    let ids = List.map (fun x -> x.txt) ids in
    let rem =
+      (* First process the existentials introduced by this constructor.
+         Just need to make their definitions abstract. *)


Naive question: why do we need to do two passes, one on the constructor existential instances and another on the bound existential names? Couldn't we do a single pass on the bound existential names, with logic that would also work well on the purely existential instances?

While this would be possible, this would be less robust. Namely, for internal existentials we delay the creation of the abstract type, so that we will never show a strange internal name for the user. For existentials that were introduced by unification, we cannot do that, so we just create alias names, hoping that the strange internal name will not show up later.

gasche · 2023-09-26T14:20:54Z

typing/typecore.ml

+        let env =
+          Env.add_type ~check:false id
+            {decl with type_manifest = Some (correct_levels tv')} !!penv
+        in


The logic between the two loops (on existential instances, then on remaining user-defined names) is fairly different: in one case you create a new abstract (for the user-provided type name) and you unify, and in the other case you define the user-provided name as a synonym to the typer-generated existential. Can you explain why there is this difference? Will it make a difference for error messages? (In particular, for not-purely-existential names, are they shown in error messages in the right-hand-side of the clause with the name given by the user?)

As I explained above, in the latter case we do not replace the original name by the one given by the user, but just define an alias, so we cannot guarantee that the internal name will never show up. In short-path mode we could tune the heuristics to show the user-defined name.

gasche · 2023-09-26T14:23:38Z

testsuite/tests/typing-gadts/name_existentials.ml

+                                      ^^^^^^^^^^^
+Error: This type annotation attempts to bind "b"
+       to the previously defined existential "$0".
+|}]


I would be curious to see an example with a type error in the right-hand-side of the pattern clause, to check that the types mentioned there are the one given explicitly by the user. For example what do we get with the following?

let rec example : type a . a ty -> a = function | Int -> 0 | Pair (type b c) (x, y : b ty * c ty) -> (example x, example (*error*)x)

I have added the example, and indeed this does not work well:

3 | | Pair (type b c) (x, y : b ty * c ty) -> (example x, example (*error*)x) ^^^^^^^^^^^^^^^^^^ Error: This expression has type "b" = "$0" but an expression was expected of type "$1"

I also checked using -short-paths, and indeed this gives the message you wanted.

3 | | Pair (type b c) (x, y : b ty * c ty) -> (example x, example (*error*)x);; ^^^^^^^^^^^^^^^^^^ Error: This expression has type "b" but an expression was expected of type "c"

gasche · 2023-10-02T08:18:43Z

typing/typecore.ml

@@ -778,21 +791,66 @@ let solve_constructor_annotation
        | _ -> assert false
  in
  if ids <> [] then ignore begin


I wonder why this ignore is here and if it is still necessary.

Indeed it was spurious; removed it.

gasche · 2023-10-02T08:27:34Z

testsuite/tests/typing-gadts/name_existentials.ml

                          ^
-Error: This type does not bind all existentials in the constructor: "type b. a"
+Error: This type annotation attempts to bind "b"
+       to the non-locally-abstract type "'a".


I think you could actually check whether b is used in the pattern argument annotations without too much difficulty. For example, when you populate the initial environment with the name_list to get the ids, you could remember the identity of the flexible dummy variables for each id. Then on line 821 in the current PR, at the place where you raise the Bind_non_locally_abstract error, you can check whether the current identity of the type is still that flexible dummy variable, or if it was unified to something else. If it was not unified (it is still the dummy), you could raise a different error: the problem is not that the name is not being bound to a non-locally-abstract type, it is not being bound at all.

(I guess that this situation could occur in cases where the name does occur syntactically in the annotation, but those positions vanish during expansion. I don't think that this is a problem.)

garrigue · 2023-10-05T04:57:47Z

(For some reason GitHub does not allow discussing an outdated chunk)

you can check whether the current identity of the type is still that flexible dummy variable, or if it was unified to something else.

Unfortunately, unification guarantees that types will become equal, but it does not say anything about which one will be redirected to the other. So it could very well be the case that we have unified with a variable that was redirected to point to the newly introduced abstract type, which cannot be detected if that variable had the same level (or higher). This can happen for instance with code like

match [] with
| (Pair (type a b) (x, y : a ty * b ty)) :: _ -> ()
| None -> ()

This said, one might consider that this code is too meaningless for caring about this false positive.

gasche

I can tell that this is broadly reasonable, but I don't understand the details (and there are places where I don't understand why it cannot be simpler). I don't think that I will be able to understand the code much better in a reasonable amount of time, and I think that the feature is important and well-tested, so I propose to move forward by approving.

gasche self-requested a review August 28, 2023 07:06

Octachron reviewed Sep 1, 2023

View reviewed changes

garrigue force-pushed the name_closed_types branch from 9f9235c to 06811d3 Compare September 21, 2023 01:37

gasche mentioned this pull request Jul 26, 2023

tests/weak-ephe-final/finaliser_handover is flaky #12345

Closed

gasche reviewed Sep 26, 2023

View reviewed changes

gasche reviewed Oct 2, 2023

View reviewed changes

gasche approved these changes Dec 18, 2023

View reviewed changes

garrigue added 11 commits January 19, 2024 12:06

Allow to name closed types in constructor type annotations

8cb4b73

add example for non-closed type

cf34c6a

changes

0a210cc

proper error message for non-closed type

641481c

add comments

245997f

update manual

ad0a5b1

singular

c8b835a

typo

64abc36

clean up abbreviations after modifying penv

0b39321

restrict naming to locally introduced existentials

afc1d4c

update manual

487d439

garrigue added 9 commits January 19, 2024 12:14

add counter-example for pre-existing existentials

40675d9

simplify and add comment about Ident.compare

039aa44

normalize

a6f577c

add Ident.compare_stamp

0027dca

rename failing test

fe54ba2

avoid spurious ambiguity error

2982248

reorder examples

162996e

improve error message as suggested

32a1e1f

add example of error in right hand side

1b5dca9

garrigue force-pushed the name_closed_types branch from a6d6a30 to 1b5dca9 Compare January 19, 2024 03:20

garrigue added 2 commits January 19, 2024 12:52

follow suggestions

ff5ae15

change log to match behavior

8ef78d7

garrigue merged commit c7ad1a1 into ocaml:trunk Jan 19, 2024

Octachron mentioned this pull request Jun 30, 2025

"named existentials" in GADT patterns should support naming all type variables used in the constructor declaration #11891

Closed

Conversation

garrigue commented Aug 28, 2023

Uh oh!

Choose a reason for hiding this comment

Uh oh!

Choose a reason for hiding this comment

Uh oh!

lpw25 commented Sep 4, 2023

Uh oh!

gasche commented Sep 20, 2023 • edited Loading Uh oh! There was an error while loading. Please reload this page.

Uh oh!

Uh oh!

garrigue commented Sep 21, 2023

Uh oh!

garrigue commented Sep 21, 2023

Uh oh!

garrigue commented Sep 21, 2023

Uh oh!

gasche commented Sep 21, 2023

Uh oh!

lpw25 commented Sep 21, 2023

Uh oh!

gasche commented Sep 21, 2023

Uh oh!

garrigue commented Sep 22, 2023

Uh oh!

goldfirere commented Sep 22, 2023

Uh oh!

garrigue commented Sep 25, 2023

Uh oh!

gasche left a comment

Choose a reason for hiding this comment

Uh oh!

Choose a reason for hiding this comment

Uh oh!

Choose a reason for hiding this comment

Uh oh!

Choose a reason for hiding this comment

Uh oh!

Choose a reason for hiding this comment

Uh oh!

Choose a reason for hiding this comment

Uh oh!

Choose a reason for hiding this comment

Uh oh!

Choose a reason for hiding this comment

Uh oh!

Choose a reason for hiding this comment

Uh oh!

Choose a reason for hiding this comment

Uh oh!

Choose a reason for hiding this comment

Uh oh!

Choose a reason for hiding this comment

Uh oh!

Choose a reason for hiding this comment

Uh oh!

garrigue Oct 2, 2023 • edited Loading Uh oh! There was an error while loading. Please reload this page.

Uh oh!

Choose a reason for hiding this comment

Uh oh!

Choose a reason for hiding this comment

Uh oh!

Choose a reason for hiding this comment

Uh oh!

Choose a reason for hiding this comment

Uh oh!

Choose a reason for hiding this comment

Uh oh!

Choose a reason for hiding this comment

Uh oh!

Choose a reason for hiding this comment

Uh oh!

Choose a reason for hiding this comment

Uh oh!

Choose a reason for hiding this comment

Uh oh!

garrigue Oct 2, 2023 • edited Loading Uh oh! There was an error while loading. Please reload this page.

Uh oh!

Choose a reason for hiding this comment

gasche commented Sep 20, 2023 •

edited

Loading

garrigue Oct 2, 2023 •

edited

Loading

garrigue Oct 2, 2023 •

edited

Loading