Add infix operators for function composition#2097

Closed
alainfrisch wants to merge 1 commit intoocaml:trunkfrom
alainfrisch:infix_function_composition_operators

@alainfrisch
Contributor

@alainfrisch alainfrisch commented Oct 9, 2018

This PR extends Stdlib with infix operators for function composition:

val ( % ) : ('a -> 'b) -> ('c -> 'a) -> 'c -> 'b
(** Function composition: [f % g] is equivalent to [fun x -> f (g x)]. *)

val ( %> ) : ('a -> 'b) -> ('b -> 'c) -> 'a -> 'c
(** Function reverse-composition: [f %> g] is equivalent to [fun x -> g (f x)]. *)

[EDIT: now the operators are called %< and %>, and put in a submodule Stdlib.Fun.Ops]
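For readers who want to experiment, the two operators can be defined locally in a couple of lines. This is a minimal sketch derived only from the signatures and doc comments quoted above, not the PR's actual patch; the helpers `double` and `add_one` are made up for illustration.

```ocaml
(* Local definitions matching the signatures quoted above.
   They are not provided by Stdlib, since this PR was closed unmerged. *)
let ( % ) f g = fun x -> f (g x)   (* f % g  runs g first, then f *)
let ( %> ) f g = fun x -> g (f x)  (* f %> g runs f first, then g *)

(* Hypothetical helpers, for illustration only. *)
let double x = 2 * x
let add_one x = x + 1

let () =
  (* double (add_one 3) *)
  assert ((double % add_one) 3 = 8);
  (* add_one (double 3) *)
  assert ((double %> add_one) 3 = 7)
```

The renamed pair mentioned in the EDIT note (`%<` and `%>`) would presumably have the same bodies, with `%<` taking the place of `%`.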

This addition is not intended to particularly encourage point-free programming, but to support it for cases people find it useful and to provide standard names for operations which people would expect to find in the Stdlib.

The same functions (with those names) exist in:

F# and Elm expose those operators under different names (<< and >>).

The question about the existence of infix operators is regularly raised on SO:

In #2010, it is proposed to add a Bool.negate : ('a -> bool) -> ('a -> bool) to support point-free programming on "Boolean predicates". With the composition operators, Bool.negate pred could directly be written as (not % pred) or (pred %> not).
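The equivalence claimed here can be checked with a small sketch. It assumes local definitions of the operators and a local stand-in for `Bool.negate`; `is_even` is a hypothetical predicate chosen for the example.

```ocaml
let ( % ) f g = fun x -> f (g x)
let ( %> ) f g = fun x -> g (f x)

(* Local stand-in for the Bool.negate proposed in #2010. *)
let negate pred = fun x -> not (pred x)

(* Hypothetical predicate, for illustration only. *)
let is_even n = n mod 2 = 0

let xs = [1; 2; 3; 4; 5]
let odd_explicit = List.filter (negate is_even) xs
let odd_backward = List.filter (not % is_even) xs
let odd_forward = List.filter (is_even %> not) xs

let () =
  assert (odd_explicit = [1; 3; 5]);
  assert (odd_backward = odd_explicit);
  assert (odd_forward = odd_explicit)
```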

(@pierreweis expressed some concerns about composition operators, but this was 20 years ago, and I'm not sure the arguments hold.)

@mshinwell
Contributor

I am not in favour, because experience shows that once such operators are added, they do get used and result in point-free code that is less readable and harder to refactor. (An argument to the converse might be that they get defined sufficiently frequently that they should be factored out into the stdlib, but I'm not sure if that is true.)

@alainfrisch
Contributor Author

Well, many other languages with similar syntax have those operators, many alternative OCaml stdlibs have them, and I can confirm that I periodically add them to my own modules.

Let's trust users to decide on their own which style fits best their use cases.

(If I were in a better mood, I would start arguing that supporting a better debugging experience is dangerous, since it will encourage users not to give enough thought to their code upfront.)

@mshinwell
Contributor

I think when designing a library it's important to be at least reasonably opinionated, which will be reflected by some users feeling constrained, and others being guided into what are hopefully idiomatic code patterns. Exactly what stance should be taken with the stdlib I do not know, but we should maybe consider the issue a bit more.

(If you are concerned that we should not need to use debuggers, we should surely be striving for particularly legible code whose behaviour is obvious by inspection, no?)

@Drup
Contributor

Drup commented Oct 9, 2018

@alainfrisch Well, my own concern is that the point regarding generalization is still pretty valid. On the other hand, this problem on its own is more than enough to doom pointfree style in OCaml, and users will have to learn about generalization at some point in their OCaml life, so it's not so problematic.
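The generalization issue mentioned here presumably refers to the value restriction: a point-free definition is an application rather than a syntactic function, so its type variables are not generalized. A minimal sketch, assuming a local definition of `%` (the names `rev_twice` and `rev_twice_poly` are hypothetical):

```ocaml
let ( % ) f g = fun x -> f (g x)

(* Point-free: the right-hand side is an application, so the inferred
   type is only weakly polymorphic (a weak type variable in place of 'a). *)
let rev_twice = List.rev % List.rev

(* The first use fixes the weak variable to int. After this line,
   applying rev_twice to a string list would be a type error. *)
let ints = rev_twice [1; 2; 3]

(* Eta-expanded: a syntactic function, so it generalizes to
   'a list -> 'a list and can be used at several types. *)
let rev_twice_poly l = (List.rev % List.rev) l

let () =
  assert (ints = [1; 2; 3]);
  assert (rev_twice_poly [1; 2; 3] = [1; 2; 3]);
  assert (rev_twice_poly ["a"; "b"] = ["a"; "b"])
```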

Contributor

@dbuenzli dbuenzli left a comment

Personally I think it's a good idea to have a function composition operator in a functional programming language.

However one thing I think needs to be considered is if we really need two of them. I think one is sufficient, it will make code reading more regular and put less operator burden on the reader's mind.

I personally would only go with the ( % ) operator of the proposal which follows the f o g mathematical notation and the natural reading order of f (g x)

@alainfrisch
Contributor Author

I personally would only go with the ( % ) operator of the proposal which follows the f o g mathematical notation and the natural reading order of f (g x)

Postfix notation is common in some domains, for instance for applying syntactic substitutions as in tσ (https://en.wikipedia.org/wiki/Substitution_(logic) even says "Applying that substitution to a term t is written in postfix notation"). Moreover, the pipe operator |> has become quite popular (and I personally like it a lot), and it encourages a way of thinking of data transformation as going, syntactically, from input to output. If I had to pick one, it would rather be %>.

@alainfrisch alainfrisch force-pushed the infix_function_composition_operators branch from c2f807a to ce04834 Compare October 9, 2018 11:20
@ocaml ocaml deleted a comment from alainfrisch Oct 9, 2018
@pmetzger
Member

pmetzger commented Oct 9, 2018

BTW, general note: I understand the desire not to encourage certain sorts of things, but if people are doing them anyway, let there at least be a standard notation for them to make it easier to read people's code when you come in cold. One can discourage the pointless, er, point-free style in other ways, including just plain explaining that it's not a good idea.

@alainfrisch
Contributor Author

including just plain explaining that it's not a good idea.

FWIW, I don't believe that point-free is a bad idea in all contexts. Would @dbuenzli 's proposal for Bool.negate in #2010 be considered point-free programming? I think so, and I don't think it's a bad style. I have some good use cases in my code base, and the fact that other similar languages and alternative stdlibs offer the operators should be an indication that point-free style is not universally bad. The next step would be to add disclaimers on ( := ) because, uh, imperative style is not very OCaml-idiomatic; and then on raise because one doesn't want people to create spaghetti code relying on exceptions for control flow; etc.

@pmetzger
Member

pmetzger commented Oct 9, 2018

FWIW, I don't believe that point-free is a bad idea in all contexts.

It clearly isn't a bad idea in all contexts.

@bluddy
Contributor

bluddy commented Oct 9, 2018

Very happy to see this: point-free style can be very clarifying in the right places. I also agree with @dbuenzli that we should only have %. As far as I can tell, there's no reason to compose functions in 2 directions, and if we want to provide a guiding style, let's do so by clarifying a normative direction of composition. This is similar to having both >>= and <<= monadic operators, the latter of which tends to make reading monadic code much harder in my experience.

@alainfrisch
Contributor Author

Existing stdlib alternatives expose the two functions, and so do F# and Elm. I don't think F#, Elm, Containers and Batteries are widely considered as encouraging people to write unreadable code.
Amusingly, I think that when people fear excesses of point-free style, they naturally think of Haskell, but AFAICT, Haskell is the exception here, as its standard library exposes only one version, the backward one (i.e. the last executed function comes first), although its controversial Flow library -- "Flow provides operators for writing more understandable Haskell." https://www.reddit.com/r/haskell/comments/324415/write_more_understandable_haskell_with_flow/ -- exposes both. So, empirical evidence does not seem to suggest that exposing both directions rather than only the backward one increases the risk of people writing unreadable code (quite the opposite).

Anyway, at least for the code I have in mind in my own code base, it would really feel weird to have only ( % ). I think I would actually rather write:

  f1
  |> (%) f2
  |> (%) f3

than:

 f3
 % f2
 % f1

because the former shows more clearly the sequencing of operations when reading the code in the natural direction (similarly to |>).
But I'd clearly prefer:

  f1
  %> f2
  %> f3
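The three spellings above denote the same function, which can be checked with a short sketch. It assumes local definitions of both operators; `f1`, `f2` and `f3` are given hypothetical bodies chosen so that the order of application matters.

```ocaml
let ( % ) f g = fun x -> f (g x)
let ( %> ) f g = fun x -> g (f x)

(* Hypothetical stages; they do not commute, so order is observable. *)
let f1 x = x + 1
let f2 x = x * 10
let f3 x = x - 3

(* Pipe-based spelling: builds f3 % (f2 % f1). *)
let via_pipe = f1 |> (%) f2 |> (%) f3
(* Backward composition: last function to run is written first. *)
let backward = f3 % f2 % f1
(* Forward composition: functions are written in execution order. *)
let forward = f1 %> f2 %> f3

let () =
  (* f3 (f2 (f1 2)): 2 plus 1 is 3, times 10 is 30, minus 3 is 27. *)
  assert (via_pipe 2 = 27);
  assert (backward 2 = 27);
  assert (forward 2 = 27);
  (* Same order as piping a value through the stages. *)
  assert ((2 |> f1 |> f2 |> f3) = 27)
```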

I can see the argument that ( % ) follows the most common (but non-universal) mathematical notation, and I'm sure there are certain kinds of combinator/point-free style libraries that would make good use of it.

So let's accept both versions and trust people not to make a soup of unreadable code with them!

@gasche
Member

gasche commented Oct 10, 2018

I also like having the two versions. In my experience they are useful in different places; depending on the domain one is more readable than the other. There are also domains of maths (typically category theory) where I often use the reversed composition order (often written ;).

@dbuenzli
Contributor

Let me put that this way: I prefer to have both of them rather than none or only %>.

Note that my concern was not about people writing unreadable point-free code; my concern is always about how many definitions you need to keep in your mind when you read code (something I care about due to my own limitations). One last argument that was not made is that infix operators are a scarce resource in OCaml and some space should be kept for end-users and their combinator libraries (the counter-argument, of course, is the M.() notation).

@alainfrisch should we move on to remove Bool.negate then ? Personally I'm perfectly fine with not % pred ?

@mshinwell
Contributor

@dbuenzli I agree entirely about the number of definitions.

The problem with the composition operators is that you'll probably end up wanting both versions. However consider then what happens when you're reviewing a really important piece of code, which perhaps is of one of the forms in @alainfrisch 's examples above. What happens in my case is that I would want to be extra certain the composition was the one in the correct order, so I'd probably go and check the definition of the operator and do some more thinking, whereas the behaviour would have been immediately obvious if the operator had not been used in the first place.
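The review concern can be made concrete with a small sketch: when both functions have the same type, the type checker cannot catch a direction mix-up, whereas the explicit form shows the order directly. The operators are defined locally and `double` is a hypothetical helper.

```ocaml
let ( % ) f g = fun x -> f (g x)
let ( %> ) f g = fun x -> g (f x)

(* Hypothetical helper with the same type as succ (int -> int). *)
let double x = 2 * x

(* Explicit form: the order of application is visible by inspection. *)
let explicit x = succ (double x)
(* Same function as explicit. *)
let with_backward = succ % double
(* Type-checks just as well, but computes double (succ x) instead. *)
let with_forward = succ %> double

let () =
  assert (explicit 3 = 7);
  assert (with_backward 3 = 7);
  assert (with_forward 3 = 8)
```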

Even if composition operators are added, I don't think functions that are useful in their absence should be removed.

@alainfrisch
Contributor Author

@alainfrisch should we move on to remove Bool.negate then ? Personally I'm perfectly fine with not % pred ?

Yes, I'd say so, if you're fine with not % pred. If the present PR is not accepted, you can always suggest adding back Bool.negate later.

However consider then what happens when you're reviewing a really important piece of code

It doesn't sound crazy to me to deliberately forbid the use of the new operators (or just one of them) in some code bases or some specific parts which you are in charge of reviewing. During internal code reviews, I regularly ask PR authors to avoid some features (such as imperative features, or even some operators such as @@) which are perfectly valid in some other contexts.

I've a lot of sympathy for arguments about the cognitive burden of operators, as I was initially quite reluctant about the use of |>, which I found useless ("what! we already have a nice syntax for function application"). But I gave it a chance, and now find it usually improves the readability of code where people decide to use it. I'm less fond of @@ and still occasionally protest against uses such as not @@ x (yes, with x being an identifier), but at least I know what @@ means and I prefer this over the situation where people would define it locally (which they would do).

@bluddy
Contributor

bluddy commented Oct 10, 2018

The reasons for having both operators make sense to me now, and I completely agree that having blessed operators is better than having people define them ad hoc.

@yminsky

yminsky commented Oct 10, 2018

I'm not 100% sure what the right answer is here. We've evaluated this question in Base/Core and decided not to include such operators, but we might of course be wrong.

But @@ is a good example of the pitfalls here. I think adding @@ has made things worse, because its presence in the standard library increases its use, and I think its use is largely pernicious. My worry is that % and %> will do the same.

Another thing one can do is to add a sub-module with these operators, so that people can open that module when they want it in their namespace. We've in fact done this internally as a way of sharing a standard choice about naming operators, without having them by default in the ambient environment.
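A sub-module approach along these lines might look like the following sketch. The module name `Compose_ops` is hypothetical, and this is not the internal module mentioned in the comment; it only illustrates how operators can be kept out of the ambient environment and opened on demand.

```ocaml
(* Hypothetical module name, for illustration only. *)
module Compose_ops = struct
  let ( % ) f g = fun x -> f (g x)
  let ( %> ) f g = fun x -> g (f x)
end

(* Local open: the operators are visible only inside the parentheses. *)
let trimmed_length = Compose_ops.(String.trim %> String.length)

(* Scoped open for a single definition. *)
let shout =
  let open Compose_ops in
  String.uppercase_ascii % String.trim

let () =
  assert (trimmed_length "  ab " = 2);
  assert (shout " hi " = "HI")
```

Outside these scopes, `%` and `%>` remain unbound, so code that does not opt in is unaffected.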

@bluddy
Contributor

bluddy commented Oct 11, 2018

I'm not sure why @@ is vilified so much. I like it.

@alainfrisch
Contributor Author

@gasche : I know how you hate to participate in stdlib discussions, but since you are the only maintainer who expressed a positive opinion on this PR, do you think you'd like to overrule other negative opinions and merge this PR? Otherwise we can close this PR and wait for someone else to propose the same thing (so that we will have two maintainers in favor). :-/

@garrigue
Contributor

I'm not sure about the two-maintainer rule: it seems to me that any change to the language (including the standard library) should be given enough time to be sure that no problem (or better solution) has been overlooked. This of course depends a bit on how far we are in the release cycle.

Independently of that, I'm rather supportive of having composition operators (and both of them, as I do not see one being better than the other), for the sake of uniformity. Yet, I would follow @yminsky in suggesting new operators be in a submodule, so that one has to declare he is using them. Anything added to Pervasives (now Stdlib) has an impact on all programs. If it's fine to mildly break backwards compatibility, it would be a good idea to have @@ and |> there too. But maybe it's too late for that, and it would be strange to only put % and %> there.

@gasche
Member

gasche commented Oct 15, 2018

I'm happy to officially support the proposal and the choice of having two operators.

@dbuenzli and @mshinwell have expressed their preference for having only one operator, but I didn't sense a very strong position from their comments, more like a personal preference/recommendation, and I think we can overrule them.

I have no strong opinion on having a sub-module for the infix operators. I'm fine either way. On the other hand, I would be firmly against moving (@@) and (|>) there without keeping an alias in Pervasives/Stdlib: this would break a lot of code.

I looked at your implementation, and I don't like the 7/11 reference in the .mli comment (not only because of the risk of trademark dispute from the convenience store chain). Nobody knows what the numbers of precedence levels are, and I think that characterizing these levels by numbers would be a mistake in the first place (that OCaml currently avoids). One cannot understand this comment without looking at the precedence table, and if we are looking at it anyway the comment is useless.

I personally don't think we need to be explicit about documenting precedence (here or for other operators). We could refer readers to the manual section on OCaml lexical conventions (but we have to be careful to do this in a future-proof way by not giving (sub)section numbers). If we want to be explicit about precedence, I would just say that it is "at the same level as *".
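What "at the same level as *" means in practice can be observed with a sketch. In OCaml an infix symbol's precedence and associativity follow from its leading character, so any operator starting with % behaves like multiplication in this respect. The operator `%-` below is hypothetical and exists only to make the grouping visible.

```ocaml
(* Hypothetical operator: subtraction under a name starting with %,
   used only to make the grouping observable. *)
let ( %- ) a b = a - b

let () =
  (* Binds tighter than binary minus: parsed as 10 - (2 %- 3). *)
  assert (10 - 2 %- 3 = 11);
  (* Left-associative: parsed as (10 %- 2) %- 3. *)
  assert (10 %- 2 %- 3 = 5);
  (* Same level as multiplication, grouped left to right:
     parsed as (2 times 5) %- 3. *)
  assert (2 * 5 %- 3 = 7)
```

For the composition operators themselves the left-to-right grouping is harmless, since composition is associative.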

Edit: we moved the discussion on precedence levels to a more appropriate thread.

@alainfrisch
Contributor Author

I'm not sure about the two-maintainer rule:

In case this was not obvious, my remark was a half-joke about the asymmetry that external contributions are sometimes easier to get in than PRs from maintainers. Of course there is no rush and we should give enough time to discuss these extensions.

new operators be in a submodule

I've no strong opinion here.

If we add some namespacing, what about adding a Stdlib.Fun "toplevel" module (at the same level e.g. as List, i.e. implemented in its own compilation unit), with Stdlib.Fun.Ops for operators? Stdlib.Fun could later be extended with other stuff (similar to https://github.com/c-cube/ocaml-containers/blob/master/src/core/CCFun.mli ).

If it's fine to mildly break backwards compatibility, it would be a good idea to have @@ and |> there too.

No, I don't think this is possible. The best one could do is mark those as being deprecated (and probably keep them forever, the cost of removing them will always be higher than keeping them).

@alainfrisch
Contributor Author

I looked at your implementation, and I don't like the 7/11 reference in the .mli comment (not only because of the risk of trademark dispute from the convenience store chain).

This was added following a reviewer request so as to be coherent with existing docstrings in Stdlib. Are you suggesting we should remove all such references, or just not add them for the new operators discussed here?

@mshinwell
Contributor

@gasche I didn't express a preference for only having one operator, I expressed a preference for having no operators. :)

If they must be introduced, I think I would prefer to see them in some separate namespace as @alainfrisch and @yminsky suggest.

@mshinwell
Contributor

mshinwell commented Oct 15, 2018

(Also, I don't understand why the precedence of operators should not be documented precisely.)

@alainfrisch
Contributor Author

For reference, precedence information was added in #1167. I tend to agree with @gasche , but may I kindly suggest that we keep the discussion for elsewhere -- e.g. comments on #1167, or a new Mantis ticket, or new PR if a solution is clear enough. For the present PR, we should only aim at being consistent with existing declarations; migrating to a different documentation style later will not be more costly.

@gasche
Member

gasche commented Oct 15, 2018

Sorry, I missed that there was a GPR for this change. I'll go discuss that there and remove my previous message here. Thanks!

@alainfrisch
Contributor Author

(Found the problem, I forgot to add the new files...)

@alainfrisch alainfrisch force-pushed the infix_function_composition_operators branch from 45ebe22 to 979a702 Compare October 23, 2018 10:33
stdlib/fun.mli Outdated

module Ops : sig
val ( %< ) : ('a -> 'b) -> ('c -> 'a) -> 'c -> 'b
(** Function composition: [f % g] is equivalent to [fun x -> f (g x)].

I think [f % g] should be [f %< g].

@alainfrisch alainfrisch force-pushed the infix_function_composition_operators branch from 6368df4 to 5673b30 Compare October 23, 2018 11:01
stdlib/fun.mli Outdated


val ( %> ) : ('a -> 'b) -> ('b -> 'c) -> 'a -> 'c
(** Function reverse-composition: [f %> g] is equivalent to [fun x -> g (f x)].
Contributor

Travis is failing because this line is over 80 columns.

Contributor

Btw, it would be nice to have a toplevel target like make pr-check, documented here, that runs most of these CI checks, to avoid too many painful round trips with the CI.

@alainfrisch alainfrisch force-pushed the infix_function_composition_operators branch from 5673b30 to 2047b14 Compare October 23, 2018 20:11
\end{tabular}
\end{latexonly}

\ifouthtml
Contributor

Two lists of modules follow this directive; you should add the Fun module there too, in alphabetical order (unfortunately, this is not checked by the check manual script /cc @Octachron).

Contributor Author

Thanks! One should add a guide on how to add modules to the stdlib (e.g. in CONTRIBUTING.md, section "Contributing to the standard library"), since this requires advanced skills.

Contributor

Actually there are some instructions here but they seem to be incomplete.

Member

The check could be updated, but an improved documentation sounds like a better first step. I will have a look.

@alainfrisch alainfrisch force-pushed the infix_function_composition_operators branch from 2047b14 to 239a3db Compare October 23, 2018 20:54
Contributor

@dbuenzli dbuenzli left a comment

Technically the PR looks correct.

@alainfrisch
Contributor Author

Thanks @dbuenzli!

@garrigue @damiendoligez : I think you were the two maintainers opposed to the solution without the sub-module. Are you happy with the current state?

@alainfrisch
Contributor Author

Ok, we had a caml-devel meeting today, and there appears to be rather widespread but silent support for having the operators in the default scope. (@xavierleroy mentions "operators in the default scope" as his solution number 1, and "operators in a sub-module" as solution number 2 -- but of course he counts from 0). @damiendoligez does not seem completely closed to the idea of actually doing that, contrary to my interpretation of his reaction here, and @garrigue's opposition also seemed not too strong.

So perhaps there is some hope to have the operators in the default scope. I'll let @damiendoligez make the final decision!

@dbuenzli
Contributor

Thanks for the report(s) @alainfrisch. However now that you have done the work I'd still suggest to keep the Fun module. I can do the work of migrating protect there. I think there's consensus we should stop adding identifiers in Pervasives and the protect PR just did that.

@alainfrisch
Contributor Author

Ok, to be clear, you're suggesting to move the operators to Stdlib, but still keep an empty Fun module in this PR, in order to make it easier to move protect to it just after; right?

@dbuenzli
Contributor

Yes.

@mshinwell
Contributor

@alainfrisch Since I wasn't at the meeting yesterday (not having expected topics such as this to be discussed), I'm going to restate my opposition to having the operators in the global scope. I honestly find your statement about "gratuitous humiliation" above a bit unhelpful. I don't think either myself or anyone else in favour of namespacing using submodules for operators is doing so to humiliate people. I am in favour of it, if we must have the operators, mainly because I have a genuine concern that these operators will otherwise become a de facto part of programming in OCaml. They can easily decrease legibility of code if applied indiscriminately, just like the current situation with @@, whose uses are rightly described above as often pernicious.

The argument that all operators must go in the global scope doesn't seem scalable to me, even notwithstanding the above. The only reasonable solution I can see for the long term is some kind of namespacing with users being able to choose what they want to use.

One other point about the namespacing is that if we start out with the operators in a submodule and there is then clamour from the community for them to be in the default scope, that can always be done, whereas the reverse is much more problematic.

What is @xavierleroy 's solution number zero?

@dbuenzli
Contributor

just like the current situation with @@, whose uses are rightly described above as often pernicious.

@mshinwell This is a bit OT but as far as I read above nobody actually described these "pernicious" uses except for labelling them as such.

The only material thing I read is @alainfrisch saying he was against gratuitous uses of it (something that also holds for |>, I sometimes see ridiculous let _ = x |> f in in the wild). Other than that there was just @diml saying he doesn't use it because he has no mnemonic for it (here are simple ones: @t, @pplication, @pplied to).

I think judicious touches of @@ can greatly improve legibility by minimising nested parentheses, thereby allowing readers to better delineate the structure of expressions.
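As an illustration of both sides of this point, here is a small sketch using only Stdlib functions. The expressions are made up for the example; the second pair reproduces the not @@ x pattern criticized earlier in the thread.

```ocaml
(* Nested parentheses: the reader has to match three closing parens. *)
let with_parens =
  String.concat ", "
    (List.map string_of_int (List.filter (fun n -> n > 1) [1; 2; 3]))

(* Same expression with @@, which is right-associative and binds
   more loosely than function application. *)
let with_at =
  String.concat ", "
  @@ List.map string_of_int
  @@ List.filter (fun n -> n > 1) [1; 2; 3]

(* A gratuitous use: plain application is shorter and just as clear. *)
let x = true
let gratuitous = not @@ x
let plain = not x

let () =
  assert (with_parens = "2, 3");
  assert (with_at = with_parens);
  assert (gratuitous = plain)
```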

@ghost

ghost commented Oct 25, 2018

@dbuenzli it's not really about the mnemonic. @ and @@ look very similar; however, they mean completely unrelated things. I find that non-intuitive. To me all these operators should be symmetrical, so we should have |> and <|, %> and <%, etc. My understanding is that we can't do that because of the precedence rules.
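The precedence obstacle can be seen in a short sketch. An infix symbol's level and associativity follow from its first character, so a mirror-image `<|` would be left-associative, like `|>`. The operator `<|` below is hypothetical and defined only for this illustration.

```ocaml
(* Hypothetical mirror image of |>, defined only for this example. *)
let ( <| ) f x = f x

let () =
  (* |> is left-associative, which is what a left-to-right pipeline
     needs. *)
  assert ((1 |> succ |> succ) = 3);
  (* <| is left-associative too, so an unparenthesized chain such as
     succ <| succ <| 1 would group as (succ <| succ) <| 1 and be
     rejected by the type checker. Explicit parentheses are needed: *)
  assert ((succ <| (succ <| 1)) = 3);
  (* @@ starts with the @ character, which puts it at a
     right-associative level, so it chains without parentheses. *)
  assert ((succ @@ succ @@ 1) = 3)
```

The same reasoning applies to `<%`, whose leading character would place it at a different level from `%>`.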

@xavierleroy
Contributor

What is @xavierleroy 's solution number zero?

Give up on this proposal and keep having no function composition operators in the standard library. The feature is not very useful and we're unable to reach consensus.

@damiendoligez
Member

Give up on this proposal and keep having no function composition operators in the standard library. The feature is not very useful and we're unable to reach consensus.

This is also my preferred solution and @mshinwell's too.

@damiendoligez
Member

@mshinwell

Since I wasn't at the meeting yesterday (not having expected topics such as this to be discussed)

This was actually discussed during lunch...

@alainfrisch
Contributor Author

@damiendoligez As said, the final word is for you. Can you either close this PR, merge it in its current state, or say if you are ok with operators at the toplevel?

@bluddy
Contributor

bluddy commented Oct 25, 2018

At this point, having converged on %> and %< (which I rather like), it would be a shame to not integrate this in some way. Not merging this PR won't remove the existence of a function as basic as composition in the wild, just as not having @@ didn't prevent low-precedence application from existing. Instead, people using these functions will make up their own unique symbols as they have been doing, making their codebases harder to read, and those using Batteries or Containers will keep using %, which I think is less clear than the solution we have here. All we'd be doing is showing yet again that the stdlib is incapable of meeting the needs of the community in this aspect.

@dbuenzli
Contributor

To give credit to the idea that it might still be a good idea to have these names somewhere in the stdlib (to be clear I don't mind the open Fun.Ops), please consider the following story:

Long before @@ and |> were introduced I designed two DSL-based libraries, one of them (cmdliner) used by many in the eco-system which needed @@ and another one (vg) which needed |>. Since these operators didn't exist at the time I had to invent them. My inventions do not match what was subsequently chosen which I find to be regrettable for people who need to read code that use these libraries.

@alainfrisch
Contributor Author

Ok, let's be realistic: no maintainer would click on Merge with Damien, Xavier and Mark being against the proposal. No need to make this last any longer.
