Skip to content

forbid optional arguments reordering with -nolabels#9411

Merged
trefis merged 20 commits intoocaml:trunkfrom
trefis:type_args-merge-sargs
Apr 18, 2020
Merged

forbid optional arguments reordering with -nolabels#9411
trefis merged 20 commits intoocaml:trunkfrom
trefis:type_args-merge-sargs

Conversation

@trefis
Copy link
Copy Markdown
Contributor

@trefis trefis commented Apr 1, 2020

Foreword: IMHO this is a fairly low-key "cleanup" PR. I hope to send some more PRs of that sort (on that particular bit of code) in the coming days, so I would ask potential reviewers to not spend to long suggesting further changes / clean-ups / simplification to this part of the code.

Of course that's only a suggestion, and if people disagree then I can come back in a few days with a bigger (and likely harder to review) PR :)


If I understood correctly, the split between sargs and more_sargs (which I spent a while trying to find meaningful names for, to no avail) exists only to be able to raise an error when an argument gets commuted even though -nolabels was passed.
I believe this can only happen for optional parameters, as in this example:

let f x ?(y=2) z = x + y + z;;

let _ = f 3 4 ~y:1;;

It is still unclear to me why optional parameters are handled in that way in this mode. Note: I understand that unlike named parameters, they cannot be applied without a label. But I believe (perhaps wrongly) that the implementation could be more obvious in that regard.
Anyway, as I said, that's a discussion for another day!

@trefis trefis force-pushed the type_args-merge-sargs branch from 930166e to 7543050 Compare April 1, 2020 10:48
@trefis
Copy link
Copy Markdown
Contributor Author

trefis commented Apr 1, 2020

It seems I had misunderstood what the current implementation does when -nolabel is given.
I believed that it was preventing arguments from being reordered during applications, and furthermore requiring optional arguments to be given with a label.

As it turns, this is not the case: optional arguments can be reordered, but only them. That is: you can swap an arbitrary number of adjacent optional parameters, as long as nothing else gets reorder.
For instance, given let f ?x ?y ?z () ?a ?b = x; y; z; a; b; (), then all of the following are fine:

let () = f ~x:() ~y:() ~z:() () ~a:() ~b:();; (* 1 *)

let () = f ~y:() ~x:() ~z:() () ~a:() ~b:();;(* 2 *)

let () = f ~x:() ~z:() ~y:() () ~a:() ~b:();; (* 3 *)

let () = f ~x:() ~y:() ~z:() () ~b:() ~a:();; (* 4 *)

But these, are not:

let () = f ~x:() ~y:() () ~z:() ~a:() ~b:();; (* 5 *)
(*                     ^^
   Error: The function applied to this argument has type
         ?z:'a -> unit -> ?a:'b -> ?b:'c -> unit
   This argument cannot be applied without label
*)

let () = f ~x:() ~y:() ~a:() () ~z:() ~b:();; (* 6 *)
(*                        ^^
   Error: The function applied to this argument has type
         ?z:'a -> unit -> ?a:'b -> ?b:'c -> unit
   This argument cannot be applied with label ~a
*)

let () = f ~x:() ~y:() ~z:() ~a:() () ~b:();; (* 7 *)
(*                              ^^
   Error: The function applied to this argument has type
         unit -> ?a:'a -> ?b:'b -> unit
   This argument cannot be applied with label ~a
*)

N.B. this is with 4.10.

The present PR partially breaks this behavior, it forbids optional arguments followed by a non-optional argument to be reordered.
That is, example 4 above is fine, but the following are not:

let () = f ~y:() ~x:() ~z:() () ~a:() ~b:();;(* 2 *)
let () = f ~x:() ~z:() ~y:() () ~a:() ~b:();;(* 3 *)
(*                           ^^
Error: The function applied to this argument has type
         ?x:'a -> ?y:'b -> ?z:'c -> unit -> ?a:'d -> ?b:'e -> unit
This argument cannot be applied without label)
*)

(furthermore, the error message is worse than it could be, but that should be fixable without too much trouble)


My intuition is that this a bug of the current implementation, and that it should be changed. To either always allow reordering arguments given with a label, or always forbid it.
There is of course no test for this behavior. I'm also going to make another wild guess in saying that: I don't think anyone is relying on it.

@garrigue: what do you think?

@lpw25
Copy link
Copy Markdown
Contributor

lpw25 commented Apr 1, 2020

Could I suggest a potentially simpler solution -- get rid of -nolabels?

@nojb
Copy link
Copy Markdown
Contributor

nojb commented Apr 1, 2020

Could I suggest a potentially simpler solution -- get rid of -nolabels?

Isn't it used to compile the labeled modules in the stdlib ?

@lpw25
Copy link
Copy Markdown
Contributor

lpw25 commented Apr 1, 2020

I think it only saves us from having some files that looks like:

...
let foo ~x ~y z = foo x y z
...

that seems lower maintenance than -nolabels itself

@garrigue
Copy link
Copy Markdown
Contributor

garrigue commented Apr 1, 2020

-nolabels is about ignoring labels on non-optional arguments in unification, i.e. it works at the types level.
As such, I think that the current implementation is correct.
The precise semantics is described in my PPL 2001 paper: "Labeled and optional arguments for Objective Caml", as -classic mode.
It was introduced as a compatibility mode, when we attempted to add labels to the standard library, so its goal is not to be stricter than the -labels mode.
The addition of labels was not accepted in the end, and this is rather the -labels mode that was made more lax, by allowing full applications without labels (note that the above paper does not describe this relaxed mode).

I agree that almost nobody uses it nowadays, which actually means that people are comfortable with the current flexible label behaviour. That's good.
It has a role in the standard library. Note that to remove it, it is not enough to manually write eta-expansions in the standard library itself, but one would also have to do the inverse operation in tests/lib-stdlabels, which also relies on -nolabels to check that no function was forgotten.
It's not clear to me what is the cost of maintaining it. It's just about understanding what it means.
Another approach would be to deprecate it, by giving it a more private name, to be sure that people do not use it for normal programming.

Sorry not review your PR today, because I have a headache. But tell me when you want me to look at it.

@garrigue
Copy link
Copy Markdown
Contributor

garrigue commented Apr 2, 2020

Yet another option could be to keep the -nolabels mode, but with a different semantics, more oriented towards coercion of compatible types than usability. For instance, juste ignore all labels in types (including optional ones), and disallow them in function abstractions and applications.
That would be enough for what is used in stdlib (and possibly elsewhere), and avoid interactions with type_application.

@trefis trefis force-pushed the type_args-merge-sargs branch from 85ac2f6 to ab39d5d Compare April 14, 2020 13:58
@trefis
Copy link
Copy Markdown
Contributor Author

trefis commented Apr 14, 2020

I was indeed wrong in assuming this was a bug, the implementation does match what is described in the paper, which I believe can be sumed up as follows:

When -nolabel is given:

  • named and anonymous arguments must be given in the correct order
  • named arguments can be given without a label
  • optional arguments must be given with a label
  • optional arguments can be commuted

This last point, which surprised me before, is an historical choice: it makes it a lot more convenient to use libraries like lablgtk where functions have a lot of optional arguments.
However, -nolabel doesn't appear to be very popular nowadays¹ , and these packages don't seem to be relying on that last behavior.

@garrigue and I talked offline, and I'd like to propose a change to the behavior of -nolabel: forbid optional arguments from being commuted.
This is less disruptive than simply removing -nolabel, and it brings us the same gains in terms of simplifying the implementation of type_application.
Indeed, -nolabel shares the same code path as full applications where all labels are omitted, so we would still need to retain that specific code path; which this proposal simplifies.

I have now updated this PR to:

  • add some more tests illustrating the current behavior
  • implement the proposed change
  • implement some further "low key cleanups" (where have I heard that before?), which shouldn't change the way things are typed or compiled.

1: grepping the source of all opam packages showed only a handful of packages building with -nolabels, all of which (but one) still compiled with the change proposed here. (the only one with a build failure is gen, and it requires a two lines change, to commute two optional arguments)

@let-def
Copy link
Copy Markdown
Contributor

let-def commented Apr 14, 2020

I reviewed the code. I believe it is simpler and correct (assuming we want the change in the behavior of optional labels in the -nolabel mode), but there too many subtleties to be 100% sure.

Success of the tests increases my confidence :).

@trefis
Copy link
Copy Markdown
Contributor Author

trefis commented Apr 15, 2020

the only one with a build failure is gen, and it requires a two lines change, to commute two optional arguments

Ping @c-cube btw, the diff is the following and can be applied now, regardless of whether this PR gets merged or not:

diff --git a/src/gen.ml b/src/gen.ml
index fbf6b0f..2540fd6 100644
--- a/src/gen.ml
+++ b/src/gen.ml
@@ -1902,7 +1902,7 @@ module Restart = struct
       match !cached with
       | Some mlist -> GenMList.to_gen mlist
       | None ->
-          let mlist = GenMList.of_gen_lazy ?caching ?max_chunk_size g in
+          let mlist = GenMList.of_gen_lazy ?max_chunk_size ?caching g in
           cached := Some mlist;
           GenMList.to_gen mlist
 end
@@ -1936,7 +1936,7 @@ let persistent gen =
 *)
 
 let persistent_lazy ?caching ?max_chunk_size gen =
-  let l = GenMList.of_gen_lazy ?caching ?max_chunk_size gen in
+  let l = GenMList.of_gen_lazy ?max_chunk_size ?caching gen in
   fun () -> GenMList.to_gen l
 
 (*$T

@alainfrisch
Copy link
Copy Markdown
Contributor

What about @garrigue's proposal of simplifying -nolabel (ignoring labels in types, and forbidding them in expressions)? Would it break a lot of code?

While we are at it, I was always a bit puzzled by the fact that it is allowed to drop all labelled arguments on a full application (in particular, I find it confusing that for a f : x:int -> y:int -> int -> int, both f 1 and f 1 2 3 are well typed, but not f 1 2). Is this feature used in practice? I guess this is intended to simplify the choice of adding labels to an existing API without breaking too much client code, but is that still a valid default behavior today? At least, shouldn't warning 6 be enabled by default?

@trefis
Copy link
Copy Markdown
Contributor Author

trefis commented Apr 15, 2020

What about @garrigue's proposal of simplifying -nolabel (ignoring labels in types, and forbidding them in expressions)? Would it break a lot of code?

In practice probably not much more than my proposal (there really aren't that many users of -nolabel on opam), if at all.
I don't think it would make the implementation noticeably easier to read, but it does make the semantics slightly simpler. I'm willing to try it out if people like it better.

While we are at it, I was always a bit puzzled by the fact that it is allowed to drop all labelled arguments on a full application [...]. Is this feature used in practice?

I have no idea. I have definitely seen it used in various random places. We would have to build all of opam to check.

At least, shouldn't warning 6 be enabled by default?

I think it should.

@gasche
Copy link
Copy Markdown
Member

gasche commented Apr 15, 2020

I'm in favor of doing first the minimal simple change that you have already implemented and has been approved. Is there something left to do before merging?

@trefis
Copy link
Copy Markdown
Contributor Author

trefis commented Apr 15, 2020

On the implementation side no, I think we're good.

In terms of approval I would like @garrigue to confirm that this is reasonable and matches what we discussed.

c-cube added a commit to c-cube/gen that referenced this pull request Apr 15, 2020
@garrigue
Copy link
Copy Markdown
Contributor

One last check: what is the error message in gen ?
I.e., I agree that the breakage is minimal, but it should be self-explaining enough that one doesn't have to look at the Changes log to find an explanation.

Also, the paper cited above is basically the only reference which precisely explains how labels work in ocaml. It was already not 100% accurate because of the full-application rule which was added later, but this change requires another patch. High time to revise it, and put it somewhere visible enough.

All in all, I'm not sure rushing this in 4.11 is a good idea. On the other hand, I completely agree that this is a really minor change, which is not going to break anything serious.

I'll write a review after checking the code side-by-side.

@garrigue
Copy link
Copy Markdown
Contributor

@alainfrisch

While we are at it, I was always a bit puzzled by the fact that it is allowed to drop all labelled arguments on a full application (in particular, I find it confusing that for a f : x:int -> y:int -> int -> int, both f 1 and f 1 2 3 are well typed, but not f 1 2). Is this feature used in practice? I guess this is intended to simplify the choice of adding labels to an existing API without breaking too much client code, but is that still a valid default behavior today? At least, shouldn't warning 6 be enabled by default?

I tend to agree with you, as this gives a much cleaner semantics. However, we're not going to discuss this in this PR. You're free to open a new PR for that, as this is an important question.

@trefis
Copy link
Copy Markdown
Contributor Author

trefis commented Apr 17, 2020

One last check: what is the error message in gen ?

# File "src/gen.ml", line 1905, characters 44-51:
# 1905 |           let mlist = GenMList.of_gen_lazy ?caching ?max_chunk_size g in
#                                                    ^^^^^^^
# Error: The function applied to this argument has type
#          ?max_chunk_size:int ->
#          ?caching:bool -> 'a GenMList.gen -> 'a GenMList.t
# This argument cannot be applied with label ?caching

@trefis
Copy link
Copy Markdown
Contributor Author

trefis commented Apr 17, 2020

All in all, I'm not sure rushing this in 4.11 is a good idea. On the other hand, I completely agree that this is a really minor change, which is not going to break anything serious.

Given that the only opam package broken by this has already been patched, I see no reason to delay the change.
I also doubt that delaying this by 6 months means that it will receive more testing: I don't know how many people test trunk regularly (rough estimate: not that many), but I'd be surprised if (m)any of them were using -nolabels.

@gasche
Copy link
Copy Markdown
Member

gasche commented Apr 17, 2020

We can merge this as soon as it is ready. It was ready yesterday, but now we have to wait for Jacques' review :-)

Copy link
Copy Markdown
Contributor

@garrigue garrigue left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

After a careful reading, it seems that the code is doing what you intend it to do (assume I understand correctly your intentions).
The only problems I found are an apparently wrong assumption (reflected in a variable name), but I don't think it has consequences, and a discrepancy with what I thought were your intentions originally w.r.t. -nolabels.

if optional && List.mem_assoc Nolabel sargs then
eliminate_optional_arg ()
else begin
(* No argument was given for this non-optional parameter, we
Copy link
Copy Markdown
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

What lets you think that this parameter in non-optional ?
The if-condition is a conjunction, i.e. we are here either because l is non-optional, or because it is optional but there is no unlabelled argument left in the application.
So the variable name is probably wrong (should be omitted_parameters).

Copy link
Copy Markdown
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Good point, thanks!
I pushed the change you suggested (and updated the comment).

(remaining_sargs, use_arg sarg l')
else if optional &&
not (List.exists (fun (l, _) -> name = label_name l)
remaining_sargs)
Copy link
Copy Markdown
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

This looks like another departure from the current semantics of -nolabels.
Currently:

# let f ?x ?y () = ();;
val f : ?x:'a -> ?y:'b -> unit -> unit = <fun>
# f ~y:3;;
- : ?x:'a -> unit -> unit = <fun>

I.e., you need a non-labeled argument to erase optional arguments.
But with this code, we would have:

# f ~y:3;;
- : unit -> unit

So we really need a new specification.

Copy link
Copy Markdown
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

I should have made this more explicit, sorry. But this doesn't feel like "another departure" to me, I see it as a result from the fact that optional arguments cannot be commuted anymore.

@garrigue
Copy link
Copy Markdown
Contributor

garrigue commented Apr 17, 2020

OK, just to clarify.
The original behaviour of optional arguments in -nolabels was the same as with -labels, except that as one cannot distinguish between (non-optional) labels and unlabelled arguments, it required optional arguments to come in groups. There was an explicit semantics.
The new behaviour is that an optional argument is discarded as soon as an argument with a different label (or no label) is applied at its position.
While this new behaviour kind of makes sense, and has the advantage of maintaining some amount of compatibility (i.e., the breakage in gen is easy to fix), this is a new semantics.
I am concerned with changing the semantics in 4.11 without further discussion.
If there is a consensus that this is the desired behaviour, I'm fine (and the correction I requested is only changing a variable name), but are we sure that we all agree on what are the semantics of this change (as they are not specified explicitly), and that we can have a consensus of everybody concerned over the week-end.

@garrigue
Copy link
Copy Markdown
Contributor

One last check: what is the error message in gen ?

# File "src/gen.ml", line 1905, characters 44-51:
# 1905 |           let mlist = GenMList.of_gen_lazy ?caching ?max_chunk_size g in
#                                                    ^^^^^^^
# Error: The function applied to this argument has type
#          ?max_chunk_size:int ->
#          ?caching:bool -> 'a GenMList.gen -> 'a GenMList.t
# This argument cannot be applied with label ?caching

Reading the message, and re-reading the code, I realize that the semantics is even more involved.
I.e., it is prohibited to discard a parameter if it appears later in the arguments.
I suppose this is an extra security to avoid changing the behaviour on existing code, but it makes it harder to define a simple semantics.
Also, logically the error message should explain the problem.

@gasche
Copy link
Copy Markdown
Member

gasche commented Apr 17, 2020

My understanding was that you (@garrigue) and @trefis had discussed this offline and that you both agreed that the new semantics is a good idea. Given that no one cares about -nolabels anymore, personally I would have thought that it is good enough. If you do not like the new semantics, or if you want to consult a lot more people than whoever is already in this PR, then of course there is no time to do this before 4.11 and you should not rush things. It's your call.

@garrigue
Copy link
Copy Markdown
Contributor

There may have been a misunderstanding. I have indeed said that I am open to a change in semantics, as -nolabels is very rarely used in a meaningful way. But I was not expecting a short time decision, and I had not looked at the details of the code yet. The code is fine, but it showed me that my understanding of this semantics was only partial.

@garrigue
Copy link
Copy Markdown
Contributor

Or I am just too nervous, and it can be proven easily that the new behaviour is only a restriction of the current one?
In this case, it might be fine to be less careful.

@garrigue
Copy link
Copy Markdown
Contributor

Well, no. It would still require that the -nolabels code check for the presence of an unlabelled argument, which it doesn't. Sorry for thinking aloud.

@trefis trefis force-pushed the type_args-merge-sargs branch from ab39d5d to d981fc7 Compare April 17, 2020 10:06
@gasche
Copy link
Copy Markdown
Member

gasche commented Apr 17, 2020

Let's leave @garrigue, who is the author and caretaker of this code, to make further decisions on this PR. I suspect that my comments have only increased the noise in decision-making, and I'll unsubscribe now.

@trefis trefis force-pushed the type_args-merge-sargs branch from d981fc7 to 4bb63de Compare April 17, 2020 16:15
@trefis
Copy link
Copy Markdown
Contributor Author

trefis commented Apr 17, 2020

@gasche: I agree that @garrigue should probably have the last word on this topic, but I think it's good (and helpful) that people (Alain and you) spontaneously voiced an opinion on this topic.
Anyway.

@garrigue and I chatted again, and we reached the following conclusion:

  • while the behavior of the compiler with this PR was "reasonable" on the example he gave above, it means that this PR is change of semantics, not simply a restricted version of the previous one: his example is accepted with both versions, at a different type.
    So we agreed to change that, and restrict even further: optional arguments cannot be commuted with -nolabels, and can only be eliminated if some unlabelled arguments remain. That change means that his example is now rejected.
    I added the example in a preliminary commit at the beginning of the patch series, and then implemented the restriction.
  • in situations where the program would typecheck in -nolabels if optional arguments were allowed to commute, we should enrich the error message to explain that the compiler is stricter since 4.11.
    The last commit of the series does that.
    I'm not a big fan of the message, and it is sometimes displayed in cases that were already rejected before (i.e. cases were the optional argument would need to commute with a non-optional one). But I would say that's acceptable.

@trefis trefis changed the title type_args: merge sargs and more_sargs forbid optional arguments reordering with -nolabels Apr 17, 2020
Copy link
Copy Markdown
Contributor

@garrigue garrigue left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

I am now convinced that this is now just a restriction of the previous behaviour.
The author of the single incompatible code available on opam is informed, and the fix is trivial.
LGTM.

Error: The function applied to this argument has type
?x:'a -> a:'b -> ?y:'c -> z:'d -> unit
This argument cannot be applied with label ?y
Since OCaml 4.11, optional arguments do not commute when -nolabels is given
Copy link
Copy Markdown
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

This error message makes everything clear.

?x:'a -> ?y:'b -> unit -> unit
This argument cannot be applied with label ~y
Since OCaml 4.11, optional arguments do not commute when -nolabels is given
|}]
Copy link
Copy Markdown
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

And this incoherence is fixed. We have now a pure restriction.

@trefis trefis force-pushed the type_args-merge-sargs branch from f7edce1 to a321395 Compare April 18, 2020 09:09
@trefis trefis merged commit 0d92d18 into ocaml:trunk Apr 18, 2020
@trefis trefis deleted the type_args-merge-sargs branch April 18, 2020 11:02
Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment

Labels

None yet

Projects

None yet

Development

Successfully merging this pull request may close these issues.

7 participants