Skip to content

Allow arbitrary structure items in let expressions#14040

Merged
nojb merged 21 commits intoocaml:trunkfrom
nojb:generalize_local_structure_items
Dec 2, 2025
Merged

Allow arbitrary structure items in let expressions#14040
nojb merged 21 commits intoocaml:trunkfrom
nojb:generalize_local_structure_items

Conversation

@nojb
Copy link
Copy Markdown
Contributor

@nojb nojb commented May 16, 2025

This PR contains the last bits needed to complete the plan exposed in #13835. Since #13839, the compiler is technically able to handle arbitrary structure items in let expressions. Here, we expose this possibility to the user by extending the language accepted by the parser.

After this PR, all kinds of structure items are allowed in let expressions, with the following two exceptions:

  • let bindings (to avoid the seemingly strange syntax let let x = 12 in ...)
  • include (as far as I can see, let include would be a less efficient equivalent to let open, and inviting confusion between the two).

The manual has been updated. I am planning to add some tests shortly.

@nojb nojb force-pushed the generalize_local_structure_items branch from 9533714 to 983ec87 Compare May 16, 2025 12:27
Copy link
Copy Markdown
Contributor

@v-gb v-gb left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

The change straightforwardly does what it says. I'm certainly in favor.

It'd be slightly better to have an example of syntax for each new local structure item, perhaps in testsuite/tests/exotic-syntax. Probably not strictly necessary, but it'd help make sure there's no "harmless" shift/reduce syntax conflict that prevents some construct from actually parsing, which I'm pretty sure I've seen happen before.

I suppose it's been discussed in past PRs, but I haven't seen such discussions, so feel free to ignore this: to me, the name Pexp_letitem seems more consistent with the naming convention of the AST. Pexp_struct_item would be expected to have a payload of struct_item.

Finally, just some things I wondering about while thinking about this change, you can skip.

In principle, Pexp_seq (e1, e2) could be replaced by Pexp_struct_item (Pstr_eval e1, e2) and Pexp_let (bindings, e) by Pexp_struct_item (Pstr_value bindings, e), if the latter constructions weren't forbidden by an invariant. But things are good as they are, things can be refactor at any time if desired.

Although expressions are structure items, let print_string "a" in ... is not accepted (it wasn't clear from the description, and I was wondering since let%foo print_string "a" in .. would be better than [%foo e1; e2] or (yuck) print_string "a" ;%foo e2).

While trying things out in the toplevel, I tried to create a recursive module let rec module M = ... in, which doesn't work. Instead you have to say let module rec M ... in .... Not what I'd prefer to write, but obviously I understand why, and also who cares since recursive modules are very rare. Still, I wondered why we have these extra let: if we allowed let f () = module rec M = ... in (), we'd get shift/reduce conflict on (module Foo :: is it a first class module or a local module? It's ambiguous at this point in the syntax. Now maybe it's possible to restructure the grammar to avoid this, but that's too much effort to try.

The similarity between fun (type t) -> .. and let type t in ... is a bit unfortunate, but oh well.

Comment on lines +1397 to +1399
\subsubsection*{sss:expr-struct-item}{Local structure items}
\ikwd{let\@\texttt{type}}
(Introduced in OCaml 5.5)
Copy link
Copy Markdown
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

I think it'd be preferable to have this new section, but also remove the previous sections about local open/module/exception, and instead add a note saying "prior to ocaml to ocaml 5.5, only local open/module/exception were accepted".
This way, the manual describes the current syntax, instead of having the shape of the manual mirror the history of the syntax.

Copy link
Copy Markdown
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Good point. I pushed in df96e32 a rework of the manual additions. I didn't remove the "old" subsections because they contain remarks and clarifications which are useful for those special cases, but I put them under a general section explaining the general construct introduced in this PR.

See a preview over at https://nojb.github.io/ocaml-manual-wip/expr.html#ss%3Aexpr-local-items.

Copy link
Copy Markdown
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Sure, sounds good.

@nojb
Copy link
Copy Markdown
Contributor Author

nojb commented May 27, 2025

Thanks @v-gb for the review! I will look at your comments in detail a bit later today, but here are some initial responses:

It'd be slightly better to have an example of syntax for each new local structure item, perhaps in testsuite/tests/exotic-syntax. Probably not strictly necessary, but it'd help make sure there's no "harmless" shift/reduce syntax conflict that prevents some construct from actually parsing, which I'm pretty sure I've seen happen before.

Good idea. I will add this.

I suppose it's been discussed in past PRs, but I haven't seen such discussions, so feel free to ignore this: to me, the name Pexp_letitem seems more consistent with the naming convention of the AST. Pexp_struct_item would be expected to have a payload of struct_item.

The precise name was chosen in #13839, but without much discussion. I was also told (offline) that the AST node name was a bit confusing. I am open to renaming it (since this patch has not been released, we can still do it). Pexp_letitem seems OK to me. @gasche: as having bikeshedded about the name of this constructor before; what do you think?

Although expressions are structure items, let print_string "a" in ... is not accepted (it wasn't clear from the description, and I was wondering since let%foo print_string "a" in .. would be better than [%foo e1; e2] or (yuck) print_string "a" ;%foo e2).

Indeed, you are right this was not mentioned in the description. It is an interesting point. Let me think about it a bit more and get back here (I need to check there are no parsing issues if we were to allow it).

@v-gb
Copy link
Copy Markdown
Contributor

v-gb commented May 27, 2025

Although expressions are structure items, let print_string "a" in ... is not accepted (it wasn't clear from the description, and I was wondering since let%foo print_string "a" in .. would be better than [%foo e1; e2] or (yuck) print_string "a" ;%foo e2).

Indeed, you are right this was not mentioned in the description. It is an interesting point. Let me think about it a bit more and get back here (I need to check there are no parsing issues if we were to allow it).

Sorry, I wasn't suggesting to try this, I simply thought this was part of your change until I read the code. I think there's no chance that this can work. Both because accepting both let expr in and let pattern = expr in would conflict hard, and because accepting both let+ a in and let a in with unrelated meanings is weird.

@nojb
Copy link
Copy Markdown
Contributor Author

nojb commented May 27, 2025

I think there's no chance that this can work.

Yes, indeed, sorry, I have forgotten some of the details since I first wrote the code :)

@nojb
Copy link
Copy Markdown
Contributor Author

nojb commented May 28, 2025

It'd be slightly better to have an example of syntax for each new local structure item, perhaps in testsuite/tests/exotic-syntax. Probably not strictly necessary, but it'd help make sure there's no "harmless" shift/reduce syntax conflict that prevents some construct from actually parsing, which I'm pretty sure I've seen happen before.

Added in 8981bbc.

@nojb nojb force-pushed the generalize_local_structure_items branch from d240d69 to 5c17579 Compare May 30, 2025 08:12
Comment on lines +165 to +178
let () =
let type t = A of int | B in
let _ = [A 42; B] in
let type t = .. in
let type t += A of string in
let _ = A "hello" in
let class c = object method f = 42 end in
let class type ct = object method f : int end in
let class d : ct = object (self) inherit c initializer print_int (self # f) end in
let external f : 'a -> 'a = "%identity" in
let [@@@warning "-unused-var"] in
let v = (42, 12) in
assert (f v == v)

Copy link
Copy Markdown
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Considering this feature is also tested in typing-misc/letitem.ml, I think it'd be net simpler to consolidate the tests by moving this section into letitem.ml (also, I'm the one who suggested adding tests here, seeing the rest of the file, they seem a bit out of place).

Copy link
Copy Markdown
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Not sure why github marked this as outdated, it still looks relevant to me.

Copy link
Copy Markdown
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Moved the tests in 3889d2e, thanks.

@314eter
Copy link
Copy Markdown
Contributor

314eter commented Jul 27, 2025

There are let expressions for classes and class types too.

class empty = let open M in object end

They only allow let open, and since they were not included in #14009 don't support extension points.

Should they be updated too, or are classes obscure enough to leave them untouched?

@nojb nojb force-pushed the generalize_local_structure_items branch from 5c17579 to bec4ec5 Compare July 28, 2025 05:34
@nojb
Copy link
Copy Markdown
Contributor Author

nojb commented Jul 28, 2025

I suppose it's been discussed in past PRs, but I haven't seen such discussions, so feel free to ignore this: to me, the name Pexp_letitem seems more consistent with the naming convention of the AST. Pexp_struct_item would be expected to have a payload of struct_item.

The precise name was chosen in #13839, but without much discussion. I was also told (offline) that the AST node name was a bit confusing. I am open to renaming it (since this patch has not been released, we can still do it). Pexp_letitem seems OK to me.

I opened #14171 to discuss this point.

@nojb
Copy link
Copy Markdown
Contributor Author

nojb commented Jul 28, 2025

Should they be updated too, or are classes obscure enough to leave them untouched?

I wouldn't say that, but let bindings in class expressions have enough peculiarities (see eg #13178) that I think it would be better to treat them in a separate PR.


\subsection{ss:expr-other}{Other}
\subsection{ss:expr-local-items}{Local structure items}
(Introduced in OCaml 5.5)
Copy link
Copy Markdown
Contributor

@v-gb v-gb Jul 28, 2025

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Suggested change
(Introduced in OCaml 5.5)
(General case introduced in OCaml 5.5, only let open, non recursive let module and let exception were supported before)

(might need something around "let open" etc to mark them as keywords).

Otherwise it sounds like the whole section is new.

Copy link
Copy Markdown
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Fixed in 6a5962f, thanks.

Comment on lines +1280 to +1284
@expr@. This construction is equivalent to:
\begin{center}
@"let" "module" M = "struct" local-definition "let" body "=" expr "end" "in" M.body@
\end{center}
without the overhead of creating the actual module @M@ nor the binding @body@.
Copy link
Copy Markdown
Contributor

@v-gb v-gb Jul 28, 2025

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

While this equivalence is correct, I was surprised when I read this, because this text is right after the definition of a let item, so it feels like it's saying "equivalently, we could have defined let items in terms of let module this way", which would not define how let module itself works.

Maybe we could say:

This construction has equivalent scoping and evaluation to this module expression:
`struct local-definition let body = expr end`

which is enough to give the intuition about "what does let [@@@ocaml.warning "-4"] in ... do?", without the recursivity.

Copy link
Copy Markdown
Contributor Author

@nojb nojb Dec 1, 2025

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

The problem is that struct local-definition let body = expr end evaluates to a structure, not the value of expr, so this could also be confusing.

The original explanation in terms of let module had been written when the special case of let module was explained in the manual before let <struct item>, but indeed that is no longer the case, so it doesn't make sense anymore.

Copy link
Copy Markdown
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

I went with your suggestion anyway, with a slight rewording: 04907dd

Comment on lines +1289 to +1290
\subsubsection{sss:expr-let-type}{Local types}
\ikwd{let\@\texttt{type}}
Copy link
Copy Markdown
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Minor, but maybe we should explain the very common items before the niche one, which local types will presumably be.
It feels like the order let open, let module, anything else might be most sensible ?

Copy link
Copy Markdown
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Good point. Fixed in 3cf766b

Comment on lines +1292 to +1293
The form @"let" typedef { "and" typedef } "in" expr@ introduces a type (or a
collection of mutually recursive types) in the scope of an expression:
Copy link
Copy Markdown
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Shouldn't there be a an optional rec, if we mention recursive types ?

Well, even that is not enough, because further down, you talk about extensible type constructors, which have a different syntax entirely. In fact, it would probably make more sense to merge the extensible type constructors section into the let exception one somehow.

Copy link
Copy Markdown
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Shouldn't there be a an optional rec, if we mention recursive types ?

Fixed in f127dc0, thanks.

Well, even that is not enough, because further down, you talk about extensible type constructors, which have a different syntax entirely. In fact, it would probably make more sense to merge the extensible type constructors section into the let exception one somehow.

Fixed in e452b38, thanks.

\begin{caml_example}{toplevel}
let gen () = let exception A in A

let () = assert(gen () = gen ());;
Copy link
Copy Markdown
Contributor

@v-gb v-gb Jul 28, 2025

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Urh, it seems that github doesn't support commenting on unmodified parts of the code.

Anyway, the first sentence of the let module is now describing a syntax less general than is supported (in fact, the base description is already incorrect, as it claims that let module F(A : B) = ... in ... is not valid, which it is).

This should presumably refer to some kind of module_decl grammar item.

Copy link
Copy Markdown
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

There was no grammar item for module definitions, so I introduced one (module-definition) and used it here: 9cca85a

Thanks.


\subsubsection*{sss:expr-lazy}{Lazy expressions}
\ikwd{lazy\@\texttt{lazy}}
Extensible types can also be introduced in this way. The example below exhibits
Copy link
Copy Markdown
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Terminology issue: this is demonstrating the creation of an extensible type constructor, not extensible type.
The creation of an extensible type would be let type t = .. in blabla.

Copy link
Copy Markdown
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Indeed, clarified the wording in 9598a0d

Comment on lines +1397 to +1399
\subsubsection*{sss:expr-struct-item}{Local structure items}
\ikwd{let\@\texttt{type}}
(Introduced in OCaml 5.5)
Copy link
Copy Markdown
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Sure, sounds good.

@nojb nojb force-pushed the generalize_local_structure_items branch from df96e32 to cbb5d93 Compare November 3, 2025 05:40
@gasche
Copy link
Copy Markdown
Member

gasche commented Nov 26, 2025

This PR needs an approval before being mergeable. @v-gb, out of curiosity, now that @nojb has taken your review comments into account, would you consider "approving" the PR? (That is, taking responsibility for saying that this is good in its current state and could be merged.)

@nojb
Copy link
Copy Markdown
Contributor Author

nojb commented Nov 26, 2025

I am planning to go over the documentation aspects of this PR once more to make sure I took into account all of @v-gb's suggestions, because I have forgotten some of the details in the interim period. But another review pass and/or approval would be appreciated in any case :)

@v-gb
Copy link
Copy Markdown
Contributor

v-gb commented Nov 26, 2025

@gasche I do intend to approve, but not yet: there are still outstanding review comments.

Copy link
Copy Markdown
Member

@gasche gasche left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

I have reviewed the implementation part of this PR, which is merely a change to the parser now that most of the logic has been implement in #13839. It is fine. I am approving now, based on the idea that the tests and the manual will be in a good state thanks to @v-gb's review (once both of you have converged on potential manual changes).

@nojb
Copy link
Copy Markdown
Contributor Author

nojb commented Dec 1, 2025

@v-gb I think I addressed all your comments (hopefully I haven't missed any!). Can you take a look whenever convenient and let me know if you are content with the current state of the PR?

The updated manual can be previewed over at https://nojb.github.io/ocaml-manual-wip/expr.html#ss%3Aexpr-local-items.

Thanks again for your thorough review!

Copy link
Copy Markdown
Contributor

@v-gb v-gb left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Looks good to me, modulo the PR I made against this PR.

@nojb
Copy link
Copy Markdown
Contributor Author

nojb commented Dec 2, 2025

Looks good to me, modulo the PR I made against this PR.

Thanks for the PR, I checked that the manual builds OK and merged it.

Since @gasche has already approved as well, am planning to merge this PR once CI passes. Thanks a lot for your help along the way!

@nojb nojb merged commit 1010272 into ocaml:trunk Dec 2, 2025
24 checks passed
@nojb nojb deleted the generalize_local_structure_items branch December 2, 2025 04:18
@nojb
Copy link
Copy Markdown
Contributor Author

nojb commented Dec 2, 2025

Merged. Thanks everyone!

@nojb nojb changed the title Allow mostly arbitrary structure items in let expressions Allow arbitrary structure items in let expressions Dec 2, 2025
Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment

Labels

None yet

Projects

None yet

Development

Successfully merging this pull request may close these issues.

4 participants