Skip to content

Use 4.11 AST#1387

Merged
emillon merged 2 commits intoocaml-ppx:masterfrom
kit-ty-kate:411
Jul 1, 2020
Merged

Use 4.11 AST#1387
emillon merged 2 commits intoocaml-ppx:masterfrom
kit-ty-kate:411

Conversation

@kit-ty-kate
Copy link
Copy Markdown
Contributor

This PR should add support for OCaml 4.11 to ocamlformat.
At the moment the tests fail during testing (reformat_string.ml gets its comments removed), however it seems to be fixed by #1383.

I'm not sure how critical the location fields are to ocamlformat so I'll let you decide whether the loc and Location.none are correctly placed here.

Concerning the deprecation of Longident.parse I can try to use Parse.longident instead, however the semantics are different between the two functions, also I'm not sure how to get it without accessing to compiler-libs directly as ocaml-migrate-parsetree does not have the function.

@jberdine
Copy link
Copy Markdown
Collaborator

jberdine commented Jun 4, 2020

It would be good to check that Cmts.Loc_tree.of_ast is implemented in such a way that the newly added locations in the 4.11 parsetree make into the location tree. Maybe nothing needs to be done, but I'm not sure and it would be good to check.

| Pconst_string (s, _, Some delim) ->
wrap_k (str ("{" ^ delim ^ "|")) (str ("|" ^ delim ^ "}")) (str s)
| Pconst_string (s, None) -> (
| Pconst_string (s, _, None) -> (
Copy link
Copy Markdown
Collaborator

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

One need to look for comments using the currently discarded locations

@emillon
Copy link
Copy Markdown
Collaborator

emillon commented Jun 24, 2020

FTR, I suggested that the new location added to Pconst_string could replace the tokens_at part in Source.string_literal. That's true, but with a twist: the location in the token refers to the string with delimiters (always double quotes, since delimited strings are raw), while the location in the AST only refers to the payload. So if we want to use this location (which requires a 4.11 parser), we'll need to adjust the literal lexer to skip the enclosing quotes.

@emillon
Copy link
Copy Markdown
Collaborator

emillon commented Jun 25, 2020

About comments attached to the new location: my understanding is that there is always a strict inclusion between the two locs:

Consider a string literal as an expression, let's destructure it as {pexp_loc = loc_expr; pexp_descr = Pconst_string (_, loc_const, _)}.

The locs look like this:

double quoted string

"hello"
 ^^^^^  loc_const
^^^^^^^ loc_expr

raw string

{delim|hello|delim}
       ^^^^^        loc_const
^^^^^^^^^^^^^^^^^^^ loc_expr

quoted extension

{%f|hello|}
    ^^^^^   loc_const
^^^^^^^^^^^ loc_expr

quoted extension with delimiter

{%f delim|hello|delim}
          ^^^^^        loc_const
^^^^^^^^^^^^^^^^^^^^^^ loc_expr

There is no lexical way that I know of to put comments within loc_expr (in the %f delim case, it's possible to insert whitespace, but no comments). I'm not super familiar with the comment attachment algorithm, but I think that means that nothing will get attached to loc_const. So an assertion that the loc tree is empty at the end of formatting (which is already in place?) should be enough.

Does that seem correct @hhugo @gpetiot @jberdine ?

@jberdine
Copy link
Copy Markdown
Collaborator

I expect that by and large the comment placement behavior won't be affected due to the strict nesting. However, uses of Cmts.is_adjacent could care. It would be good to check those and see whether that will matter, and decide whether it is better to adjust is_adjacent to take delimiters into account, or to ensure that the string locations include them.

@emillon emillon changed the title Support OCaml 4.11 Use 4.11 AST Jul 1, 2020
@emillon
Copy link
Copy Markdown
Collaborator

emillon commented Jul 1, 2020

I think this should be OK, we can refine that if necessary when we add tests for 4.11 features. So this seems ready to get merged in my eyes. What do you think @gpetiot ?

@emillon emillon added the no changelog set this to bypass the CI check for changelog entries label Jul 1, 2020
@emillon
Copy link
Copy Markdown
Collaborator

emillon commented Jul 1, 2020

(we'll add changelog entries for explicit 4.11 features)

@gpetiot
Copy link
Copy Markdown
Collaborator

gpetiot commented Jul 1, 2020

Is there a way to integrate #1383 in this ? i.e. use the loc of constants from the ast for ocaml >= 4.11 and use the functions of #1383 for ocaml < 4.11 ?

@emillon
Copy link
Copy Markdown
Collaborator

emillon commented Jul 1, 2020

I don't think so. These new locs are not the location of the constants themselves, so no comments should get attached to them. The only place we can leverage the new locs is for re-lexing the payload.

@emillon emillon added this to the 4.11 support milestone Jul 1, 2020
Copy link
Copy Markdown
Collaborator

@gpetiot gpetiot left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Looks good to me, thanks!

@emillon emillon merged commit 41496b4 into ocaml-ppx:master Jul 1, 2020
@jberdine
Copy link
Copy Markdown
Collaborator

jberdine commented Jul 1, 2020 via email

@emillon
Copy link
Copy Markdown
Collaborator

emillon commented Jul 1, 2020

They are the loc for string contents (within delimiters), highlighted as loc_const in #1387 (comment).

Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment

Labels

no changelog set this to bypass the CI check for changelog entries

Projects

None yet

Development

Successfully merging this pull request may close these issues.

6 participants