Skip to content

Revert option doc-comments-val and restore the values of 0.13.0#1354

Closed
gpetiot wants to merge 2 commits intoocaml-ppx:masterfrom
gpetiot:revert-doc-comments-val
Closed

Revert option doc-comments-val and restore the values of 0.13.0#1354
gpetiot wants to merge 2 commits intoocaml-ppx:masterfrom
gpetiot:revert-doc-comments-val

Conversation

@gpetiot
Copy link
Copy Markdown
Collaborator

@gpetiot gpetiot commented Apr 24, 2020

I tested the reformatting on some of main ocaml projects:

  • mirage
  • irmin
  • index
  • mdx
  • dune
  • owl
  • tezos
    and there is not diff with the formatted output of Add doc-comments-val=unset #1340 (which has already been validated by maintainers of these projects)

Feel free to make more tests before merging

reformat

update tests

revert value in ocamlformat profile
@gpetiot gpetiot requested review from Julow and emillon and removed request for Julow April 24, 2020 10:01
@gpetiot gpetiot changed the title Revert option doc-comments-val to not confuse users Revert option doc-comments-val and restore the values of 0.13.0 Apr 24, 2020
Copy link
Copy Markdown
Collaborator

@Julow Julow left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Looks good.
Two projects I know of did use this option:

@jberdine
Copy link
Copy Markdown
Collaborator

Note that this reintroduces the regression from 82da6c5 that makes it no longer possible to have docstrings before types and after values. This regression fix is one of the recent fixes that we need in order to be able to use a mainline version of ocamlformat. See also the revert of this in the branch that we currently need to use: e7acc28.

What is the plan (and timeline?) for restoring the fix for not having docstrings before types and after values?

@gpetiot
Copy link
Copy Markdown
Collaborator Author

gpetiot commented Apr 24, 2020

I don't follow your comment, was #746 bad? This PR shouldn't have any impact on it.

@jberdine
Copy link
Copy Markdown
Collaborator

I think that "bad" might not be the right term, but it did make it impossible to have the docstrings (in signatures at least) for types come before and those for values come after. The doc-comments-val option made that possible again (not exactly the same output as before #746, but close enough for us to use). I just tested with this PR, and we're back to needing to at least partially revert #746. I would really like us to be able to use a mainline version, and the doc-comments-val change made me optimistic, so I'm wondering what the trajectory is for doc comments.

@gpetiot
Copy link
Copy Markdown
Collaborator Author

gpetiot commented Apr 24, 2020

The trajectory for doc-comments has not been defined yet.

doc-comments-val may have been a step in the right direction but it leaves a taste of "unfinished business" and the release of 0.14.0 left a lot of users surprised with the new behavior, precisely because we didn't define the strategy yet.
And 0.14.1 keeps the option, but it's still not very clear (and just improving the doc is not enough), and it is still subject to being modified in further release once we have defined what we need what to do with it.

So the short-term solution we chose is to release a 0.14.2 that has the exact behavior as 0.13.0 to not lose all of our users for this release cycle. We will start clean and can think of a new strategy for doc-comments in the 0.15.* release cycle.

@Julow
Copy link
Copy Markdown
Collaborator

Julow commented Apr 24, 2020

Note that #1340 (+ the revert of default released in 0.14.1) was enough to restore 0.13.0 behavior.

@jberdine
Copy link
Copy Markdown
Collaborator

Thanks @Julow, I was confused as I thought that unset and revert of defaults meant that 0.14.1 was usable with little churn for people on 0.13. Do I understand correctly that the situation is that the state of doc-comments-val in 0.14.1 is unsatisfactory mostly due to seeming unfinished, and that the goal with this PR is basically to remove it while a currently-unknown something else is designed to replace it?

If that understanding is correct, then while I appreciate the goal to have a small and clean interface, it is also important and useful to have essentially unfinished functionality in cases where what the right thing to do is unclear and needs experience and feedback. From that perspective, I'm wary that removing doc-comments-val entirely would be an overreaction to the feedback on 0.14.0. In particular, now that there is an unset value for it, it seems better to keep the option but not use it by default, so that it is there for willing users to experiment with so that more data on how to solve the overall doc-comments placement issue can be gathered. If it is not used in the profiles and is documented as being experimental, is seems better to keep it than to remove it.

Separately, is there a summary of the overall constraints, feedback, etc. for the design of doc comments placement that I can read somewhere? It seems like a quite non-obvious problem, and maybe something new will strike me if I look at it with fresh eyes.

@gpetiot
Copy link
Copy Markdown
Collaborator Author

gpetiot commented Apr 25, 2020

while a currently-unknown something else is designed to replace it?

The thing is currently we don't know what is going to replace it, and we don't want to make something in a hurry. So we are stuck in a state where we either:

  • let 0.14.1 and people will get accustomed to this new option and may complain that we remove it in a future release (even if we warn people) and may refuse to upgrade unless we keep it
  • make another release where the 0.13.0 statu-quo is restored, which only postpone the issue until 0.15 or later

And the more time we take to decide, the more this choice disappears and the first solution is silently adopted.

If that understanding is correct, then while I appreciate the goal to have a small and clean interface, it is also important and useful to have essentially unfinished functionality in cases where what the right thing to do is unclear and needs experience and feedback. From that perspective, I'm wary that removing doc-comments-val entirely would be an overreaction to the feedback on 0.14.0. In particular, now that there is an unset value for it, it seems better to keep the option but not use it by default, so that it is there for willing users to experiment with so that more data on how to solve the overall doc-comments placement issue can be gathered. If it is not used in the profiles and is documented as being experimental, is seems better to keep it than to remove it.

We are worried about sending the message of "you have this new option for now, maybe it will disappear in the next release, we don't know", but I agree with your point.

The thing is we were aiming at reducing the number of options over the next releases. So of course I'm not saying we should just stop adding options all together but we need to be cautious. This option may also send the message that we may have a similar option for types, or modules, or something else, and no plan is defined yet. Is it what we want?

@jberdine
Copy link
Copy Markdown
Collaborator

Maybe it is helpful to summarize the history of these options:

  1. initially neither doc-comments nor doc-comments-val options existed, and the output was close to setting doc-comments=before and doc-comments-val=after.
  2. doc-comments was added, and it affected only val signature items.
  3. doc-comments was renamed doc-comments-val, and a new doc-comments option took it's place, which setting to after gives output with doc comments after as many syntactic forms as possible without potentially needing to add empty docstrings.

The reasoning behind the design of 1) was to place docstrings before most items, since it seemed that usually in practice, the docstrings for e.g. module or type definitions were written in such a way, and contained information such that, it was more helpful to read them before the definition. On the other hand, the docstrings of val items were usually easier to read after the type of the declared value, and so those were placed after. For instance, the docstrings of functions often refer to argument labels without duplicating them from the type. Another point is that val items are commonly short(er than their docstrings), and they do not usually contain docstrings themselves. So at stage 1), the implementation branched on the item's kind when deciding where to place the docstrings as a way to express the heuristic that was arrived at by looking at a body of code. It was never a perfect realization of the informal motivating design considerations, but it was predictable, easy to implement, and quite close in practice.

Then step 2) happened since the Jane Street code style mandated docstrings come before their items. The possible values of this option were, in hindsight, confusingly named from the beginning.

Then step 3) happened in response to feedback where some people wanted more docstrings to come after their associated items than just val items. And since this desire revealed the confusing naming of the existing doc-comments, it was renamed and a different option put in its place.

Now, there are two options controlling docstring placement, seeming to indicate that there are two orthogonal dimensions to the choice, when in fact the two options are only describing a number of different points along a single dimension. I think that we should consider what it would look like to collapse the doc-comments and doc-comments-val options into one:

doc-comments doc-comments-val (new) doc-comments
before before before
before after before-except-simple
before unset before
after before make this impossible
after after after
after unset after

What I'm proposing is to remove doc-comments-val and instead add a new value to doc-comments, say named before-except-simple. This would leave doc-comments with the values before and after indicating that all docstrings should be placed before, respectively after, their items, and before-except-simple indicating that the criterion described above should be used. Currently, after would be incompletely obeyed, since there isn't currently a mechanism to detect when placing a docstring after would introduce ambiguity, and to avoid it by adding an empty docstring to the other alternatives that cause the ambiguity.

My perception is that that situation would be reasonably clean. There would be one option, and all 3 values would have intuitive explanations. There is a point that the implementation of after is not completely finished, but should be fine as long as it isn't set by a profile, and is documented so that it is clear that the intent is to place all docstrings after, even if currently some still get placed before, and that future versions aspire to complete the implementation to get there.

Based on feedback and experience on other code bases, are those 3 styles enough? I don't know of code bases which use doc comments after their items in all cases, so I'm particularly unsure about whether that option is needed or the constraints coming from the cases where it would be desired. Are there recent results from the harvest script somewhere? I'm thinking that another possible plan would be to remove the doc-comments=after case until it is implemented fully. That corresponds roughly to reverting #746, but using updated names and documentation for clarity. But if it is used much, it would probably be better to document it's eventual goal and the differences between the current implementation and that objective. #746 is also pretty old, so it would definitely be less risky to leave the currently-incomplete implementation of doc-comments=after instead of removing it until its done.

Do you also think that that one option with three values would be a reasonable interface going forward? If so, I think that we could get to a good state very quickly, that could be done with one small PR and used for 0.14.2. Maybe I'm overlooking something?

Perhaps related, when looking at the current state of the documentation and implementation of doc-comments=after for structures, signatures, and objects, it makes me think that it might be worth trying to add a value similar to before-except-simple but that decides based on whether the associated item could (or does) have other docstring within it. For example:

(** docstring before since there are other docstrings in the type definition *)
type t = A of int (** doc of A *) | B of int (** doc of A *)

(** docstring before since there could be other docstrings in the type definition *)
type t = A of int | B of int

type t = int
(** docstring after since there can't be sub-docstrings *)

(** docstring before since there could be other docstrings in the struct *)
module M = struct
  type t = int
end

My thinking is that the behavior of such a value could be defined in an objective non-arbitrary way, and might be enough to remove the "on structures, signatures and objects for readability" exception for doc-comments=after. It also seems that such a criterion could be applied uniformly to implementations and interfaces, which would be nice. It would take some testing, but perhaps it could replace before-except-simple as well.

There might be some refinement of the criterion so that e.g. a code change from

type t = int

to

type t = Int of int

doesn't churn the docstring positions. I think that would amount to saying that the code deciding before vs after should only look at the signature item kind, and not dig deeper to e.g. see what the type declaration kind is. That seems quite reasonable to me at first glance.

Something else to keep in mind is that when an item has two docstrings, there isn't a reasonable choice: no matter what the options are set to, one must come before and one after. For example:

(** must come before *)
val f : int -> int
(** must come after *)

I'm not sure whether it would be helpful, but instead probably harmful, to the documentation to get into those details. But good to keep in mind.

@gpetiot
Copy link
Copy Markdown
Collaborator Author

gpetiot commented Apr 26, 2020

We don't have data for the usage of doc-comments-val but we can consider it's fresh enough for it's usage to be negligible.

I like the idea of before-except-simple but as you pointed out our definition of a simple item is not very stable and it can be surprising for the user to have the comments move when the associated item becomes more complicated, using specific language elements makes it more predictable for us, but is it more predictible for the user? I think having a criterion like "be a one-liner" is more understandable for the user (but more messy for us)

On a general note I would like to keep the values of this option on a single cursor (and thus avoid orthogonal options). Currently we have: (after should be renamed after-when-possible to better depict reality)

  |-------------------|------------------------------|
before               ...                     after-when-possible

we can indeed imagine having intermediate values like after-when-*-and-possible, to keep going with the "simple item" example we could have after-when-possible-and-one-liner-item (or similar), or any criterion that makes sense on this scale.

@jberdine
Copy link
Copy Markdown
Collaborator

I also think that it's important to try very hard to keep the option values just on a linear scale. It is much easier to maintain that way. Maybe we can't rule out ever needing something more, but it would be good to state that avoiding that is an objective/constraint.

I agree that renaming after to after-when-possible could be more understandable. It might be that there is some confusion about what "possible" means, since we mean what the implementation can do rather than what might be theoretically possible.

Predictability is important. I have a lot of experience with the situation where "simple" is just val (and external) items. In practice that works very well. It seems that the classification of "values" vs "types" vs "modules" is clear and natural to programmers, and that the val-or-external criterion is just an implementation of "values". In code review I have seen some surprise / mild grumbles about things like one-liner module declarations churning when they become multi-liners. At least in the code I have seen, code with docstrings on e.g. module and type language phrases is not so dense that people are unhappy when one-liners aren't treated specially. It would be possible for code to have many short type or module definitions that have docstrings, but I haven't seen it often/even.

My inclination would be to start with the smallest change from existing support, just adjusting the config option space, and then consider a situation supporting before-before-except-val-before-except-val-and-one-liners-after-when-possible, and then see if there is feedback that motivates making the implementation more complete to push after-when-possible to after, and also see if there is feedback that motivates and leads to clear definitions of additional points on the scale.

@gpetiot
Copy link
Copy Markdown
Collaborator Author

gpetiot commented Apr 27, 2020

I agree with your proposition, since the work looks short enough we can make it for a 0.14.2 in the short term. To be more concise and make sure we all agree on these points:

0.14.2:

  • doc-comments=before : unchanged
  • doc-comments=after : renamed doc-comments=after-when-possible, no change of behavior, but the "when possible" part has to be detailed in the doc
  • doc-comments-val removed
    • doc-comments-val=before : included in doc-comments=before
    • doc-comments-val=after : becomes doc-comments=before-except-val

0.15.0:

  • we can think of adding a doc-comments=before-except-val-and-one-liners if there are some user requests for it

Again having a 0.14.2 doesn't mean it's urgent, only that there is not enough changes to justify incrementing the middle version number.
What are your thoughts @Julow @emillon @samoht ?

@jberdine
Copy link
Copy Markdown
Collaborator

That summary and plan look good to me, thanks @gpetiot!

Another point to possibly add to 0.15 is to decide whether doc-comments=after is an eventual goal. This would require some non-trivial implementation work, so ought to be done, I think, only if there is demand. Perhaps we should just add a "request for comments" issue on github asking for feedback on whether the cases where after-when-possible places docstrings before are problematic. Basically it just seems that doc-comments=after-when-possible is somewhat ad hoc, and it would be good to have a plan for how to resolve that before 1.0. For instance, I'm not even sure if just removing it entirely is definitely not an option.

@gpetiot
Copy link
Copy Markdown
Collaborator Author

gpetiot commented May 4, 2020

Replaced with #1358

@gpetiot gpetiot closed this May 4, 2020
@gpetiot gpetiot deleted the revert-doc-comments-val branch May 4, 2020 09:39
Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment

Labels

None yet

Projects

None yet

Development

Successfully merging this pull request may close these issues.

4 participants