Skip to content

WIP: Hide bound variables marked with [@ocaml.private] from signatures#682

Closed
alainfrisch wants to merge 1 commit intoocaml:trunkfrom
alainfrisch:private_decl
Closed

WIP: Hide bound variables marked with [@ocaml.private] from signatures#682
alainfrisch wants to merge 1 commit intoocaml:trunkfrom
alainfrisch:private_decl

Conversation

@alainfrisch
Copy link
Copy Markdown
Contributor

Not ready for merge!

This is a first proposal to address http://caml.inria.fr/mantis/view.php?id=5764 by providing a way to specify that some non-local let-bound pattern variables should not be exposed in their structure. This is done by detecting a special attribute on pattern variables (the lhs is used in case of or-patterns):
For instance in

  module M = struct
      let x[@ocaml.private], y = 1, 2
  end

outside M, only y exists.

This is not really intended for end-users and more for ppx or other code generators that need to insert declarations for internal uses which should not be exposed in inferred signatures. (See LexiFi/landmarks#2 for a recent case where such a feature would have been useful.)

This PR is to discuss the general idea.

The current implementation is not very satisfactory, because it parses the attribute potentially multiple times. A better approach would probably to add to the Tstr_value constructor an explicit list of "public" identifiers, rather than recover it from analyzing its patterns. The logic to parse the attribute would then be nicely located in Typemod.

What to do in the toplevel is not completely clear. Should private identifiers be usable by later phrases?

@lpw25
Copy link
Copy Markdown
Contributor

lpw25 commented Jul 11, 2016

I'm generally in favour of this kind of feature as it is a reasonably frequent request from (mostly internal) users. I would actually prefer it be made available for general use by giving it a proper syntax. I had in mind something like:

private let ... = ...

I was also thinking in terms of a general feature on all definition-style structure items (so not open and include), which also partially explains my suggested syntax.

Of course this syntax doesn't allow a direct encoding of your example, where only one variable in the pattern is made private. But of course you could always do:

private let x, y = ...
let y = y

which I assume would be just as easy for ppx expanders to generate.

I also wouldn't be against using an attribute-based syntax for now to get a feel for the feature first, since it seems relatively easy to deprecate the attribute-based syntax in the future.

@lpw25
Copy link
Copy Markdown
Contributor

lpw25 commented Jul 11, 2016

As for the toplevel I suspect the most consistent behaviour is probably to allow private identifiers to be used in later phases. I tend to think of the top-level as a slight variation of defining a giant module but with some relaxed rules for naming, and so that would the behaviour I would expect.

@garrigue
Copy link
Copy Markdown
Contributor

For the record, there were private instance variables in OCaml 1.0x, with the behavior described above.
They were removed in OCaml 2.00, because they somehow didn't fit the global model of the language. In particular, as I understand it, Xavier was opposed to non-coercion based restriction on scope: if you want to remove something, write the whole interface.

I agree with Leo that if you want to make it a feature of the language, then it had better have a proper syntax. I would rather spell it let private than private let, but this is cosmetic.
This is a real extension to the language, and it should be discussed as such.

However, I'm not sure I see the connection with the bugs you describe.
To me it looks more like an hygiene problem: if you insert some code in multiple parts of a file, you want the names in the code to be non overridable. So the problem is not that the variable was overridden by error, but that it was overridable to start with. The bug is in bisect, not in the program it is used on.

For signatures alone, it should be sufficient to have mechanism that avoids printing some declarations when using -i, as suggested in the original PR. This could indeed be done via an attribute.

@alainfrisch
Copy link
Copy Markdown
Contributor Author

However, I'm not sure I see the connection with the bugs you describe.
To me it looks more like an hygiene problem: if you insert some code in multiple parts of a file

The bugs are related to instrumentation tools (Bisect, Landmarks) that need to bind one identifier for the scope of the current unit. Basically they add a declaration:

let __foo__internal__ = ...

on top of each compilation unit, and then rewrite its body to insert reference to this identifier. This goes well until people write:

open MyUnit
...

where MyUnit does not have an .mli file, which has the effect of bringing the identifier __foo__internal__ from MyUnit in the current scope, shadowing the local __foo__internal__.

Yes, this is a problem with these tools, not on programs using them. But what is the proper solution? In both cases, we change the name of the synthesized identifier to contain a unique enough stamp. I believe that taking the digest of (unit name, structure_ast, !Clflags.for_package) is a good stamp, but this really feels like a hack having to do that. Moreover, even if this solves the hygiene problem, it is not satisfactory that the tools change the signature of .mli-less units.

So the problem is not that the variable was overridden by error, but that it was overridable to start with.

Probably, but how to make an identifier non overridable?

Actually, this might suggest an alternative to this PR: since the problem is related to open bringing too many values in scope, one could use an attribute on value declarations to keep them in inferred signatures but not import them when the module is opened. (This seems even a bit more intrusive than the current PR, though.)

For signatures alone, it should be sufficient to have mechanism that avoids printing some declarations when using -i, as suggested in the original PR. This could indeed be done via an attribute.

At least two problems:

  • The usability of these instrumentation tools depends how easy it is to plug them on existing (usually large) projects. This can be achieved mostly with OCAMLPARAM (injecting a compile-time ppx and a link-time support library). If we tell users that they must generate missing .mli files with ocamlc -i, this becomes less trivial. This step cannot be done before starting to compile the project; you need to build all dependencies first. So it's hard to automate that without heavy surgery on the project's build system.
  • ocamlc -i is not a robust solution: it is not guaranteed to produce well-formed signatures.

@alainfrisch
Copy link
Copy Markdown
Contributor Author

prefer it be made available for general use by giving it a proper syntax

Fair enough, I'm generally the one advocating against the use of attributes for features that impact type-checking (@immediate and co).

I'm only a bit concerned by changing the Pstr_value constructor in the Parsetree (thus breaking syntactic tools). If we do such a change, one should probably either change the current rec_flag into a record type for later extensibility; or switch the constructor to using an inlined record (also for its value_bindings).

Note that it would even simpler to support an entire group of let bindings to be made private rather than a per-variable choice as in the PR.

As for the toplevel I suspect the most consistent behaviour is probably to allow private identifiers to be used in later phases.

Meaning that the private marker would be ignored altogether, or would there still be some (less obvious) impact?

@yallop
Copy link
Copy Markdown
Member

yallop commented Jul 12, 2016

Note that it would even simpler to support an entire group of let bindings to be made private rather than a per-variable choice as in the PR.

SML has an even more general approach here: you can write

local
  decs₁
in
  decs₂
end

to make the declarations decs₁ visible only in decs₂, but not elsewhere. You can have multiple let declarations, exceptions, type definitions, etc., in decs₁, and they are both excluded from the module interface and hidden from any code that follows decs₂ in the module definition.

Adding local would solve the bisect problem described in this PR. But it would also address some other longstanding issues in OCaml. For example, it would often be convenient to define a group of mutually-recursive functions f₁, f₂, ..., that all depend on some local definitions g₁, g₂, ..., . With local one could write it like this:

local
  let g₁ x = e₁
  let g₂ x = e₂
in
  let rec fx = e₃
      and f₂ x = e₄
end

It's possible to achieve the same thing in current OCaml using an awkward tuple binding for f₁, f₂, etc. But I don't know of an elegant way to write it. So I often miss local when writing OCaml code.

@alainfrisch
Copy link
Copy Markdown
Contributor Author

I was also thinking in terms of a general feature on all definition-style structure items

Allowing to hide e.g. type definitions would be useful, but it is more complex than hiding value definitions, since one must ensure that the types can be erased without breaking the well-formedness of the residual signature (giving an "would escape its scope" error otherwise). Do you know if this can be handled easily with the existing level-based scoping machinery in the type-checker?

@nojb
Copy link
Copy Markdown
Contributor

nojb commented Jul 12, 2016

Couldn't

local
  decs1
in
  decs2
end

be translated into something like:

include (struct decs1 decs2 end : sig include module type of (struct decs2 end) end)

which would handle all scoping issues automatically ?

@yallop
Copy link
Copy Markdown
Member

yallop commented Jul 12, 2016

@nojb: decs1 would need to be in scope in the ascription.

However, something like this might work:

include ((functor (X: module type of struct decs1 end) ->
                 struct open X decs2 end)
                (struct decs1 end))

(except, as usual, this can't be a purely syntactic transformation because it needs a fresh name, and resolving scopes needs type information)

@xavierleroy
Copy link
Copy Markdown
Contributor

@alainfrisch : Concerning the hiding of type (and module) definitions, that's a question that already arises when typing functor applications (see examples below). The Mtype.nondep_supertype function is the basis of the solution. The solution is sound but not complete. For more details, see section 5.5 of my "Modular module system" paper (http://gallium.inria.fr/~xleroy/bibrefs/Leroy-modular-modules.html).

Example: the effect of functor G is similar to making the type t local in the X structure.

# module G(X: sig type t val x:t end) = struct let x = X.x end;;
module G : functor (X : sig type t val x : t end) -> sig val x : X.t end

Sometimes it can be done, other times it cannot:

# module B = G(struct type t = int let x = 1 end);;
module B : sig val x : int end
# module C = G(struct type t = X|Y let x = X end);;
Error: This functor has type
       functor (X : sig type t val x : t end) -> sig val x : X.t end
       The parameter cannot be eliminated in the result type.
        Please bind the argument to a module identifier.

@yallop
Copy link
Copy Markdown
Member

yallop commented Jul 12, 2016

A possible alternative to adding private to structure items is to extend open to take a module expression rather than a longident. Then

let private x = e₁
let private y = e₂

could be written like this

open struct
  let x = e₁
  let y = e₂ 
end

and

local
  decs₁
in
  decs₂
end

could be written like this

include struct
  open struct decs₁ end
  decs₂
end

The treatment of module and type identifiers would be the same as for uneliminable functor arguments.

Are there any obstacles to such an extension?

@alainfrisch
Copy link
Copy Markdown
Contributor Author

include struct
  open struct decs₁ end
  decs₂
end

What's the use of the include struct ... end wrapper?

@yallop
Copy link
Copy Markdown
Member

yallop commented Jul 12, 2016

What's the use of the include struct ... end wrapper?

Without it, decs₁ would be in scope in the code after decs₂.

@lpw25
Copy link
Copy Markdown
Contributor

lpw25 commented Jul 12, 2016

Meaning that the private marker would be ignored altogether, or would there still be some (less obvious) impact?

Yeah it would be ignored in the toplevel.

Do you know if this can be handled easily with the existing level-based scoping machinery in the type-checker?

As Xavier said, I would expect nondep_supertype to be the solution to that issue.

@lpw25
Copy link
Copy Markdown
Contributor

lpw25 commented Jul 12, 2016

be translated into something like:

include (struct decs1 decs2 end : sig include module type of (struct decs2 end) end)

It is worth noting that, thanks to the somewhat broken nature of module type of, this will not always produce the right thing when module aliases are involved.

@alainfrisch
Copy link
Copy Markdown
Contributor Author

Without it, decs₁ would be in scope in the code after decs₂.

Ok, that's for emulating local. For private, this is not needed:

let private x = ....

would simply be written instead:

open struct let x = ... end

@lpw25
Copy link
Copy Markdown
Contributor

lpw25 commented Jul 12, 2016

I don't mind open struct ... end, but I'm not keen on open Foo(Bar) or open (M : S). So I would suggest extending open to support struct ... end rather than extending it to accept all module expressions.

@lpw25
Copy link
Copy Markdown
Contributor

lpw25 commented Jul 12, 2016

(I also would still like to have private let, private type, etc. but I think open struct ... end is a less invasive change and doesn't prevent supporting private later if open struct ... end proves very popular.)

@alainfrisch
Copy link
Copy Markdown
Contributor Author

Yet another alternative would be to support only module private X = ..., which allows encoding open struct ... end as module private X = ... ;; open X (with the drawback of having to invent a module name).

@alainfrisch
Copy link
Copy Markdown
Contributor Author

The Mtype.nondep_supertype function is the basis of the solution.

Thanks @xavierleroy. I've created a branch to experiment with private module declaration (and opened #683).

@bluddy
Copy link
Copy Markdown
Contributor

bluddy commented Jul 12, 2016

The idea of having a shared local scope between global functions is intriguing. I've often wished for such a thing, when some local functions become useful to more than one global function and I'd rather not make the local function global. It's also the best way to use global mutable state -- by limiting its scope to only a few select functions. It'd be nice if the syntax for this wasn't convoluted.

@alainfrisch
Copy link
Copy Markdown
Contributor Author

In particular, as I understand it, Xavier was opposed to non-coercion based restriction on scope: if you want to remove something, write the whole interface.

@xavierleroy Do you stand by this position? If so, I guess this would exclude more general proposals by @yallop and @lpw25. Would the softer original proposal (or #683), based on attribute (no proper syntax) and advertised only as an extension intended for ppx developers (not an official language feature), be acceptable?

@bobzhang
Copy link
Copy Markdown
Member

@alainfrisch I like module private X = .. , it is safer than open struct .. end

@bluddy
Copy link
Copy Markdown
Contributor

bluddy commented Sep 6, 2017

I like both private and open struct...end. Can we get back to discussing this?

Is the only way this can be integrated into the language if it survives a developers' meeting before a new version? That doesn't seem fair to small but very useful changes such as this, which don't make the cut for the agenda.

Why not expand the language with these useful tools? Or even better -- insert them as part of the 'beta features' section of the language (something I've suggested before) which aren't guaranteed full future support. If they take off, we can keep them.

For the record, I recommend placing all new features in a 'beta features' section, so we don't bind ourselves for all of eternity to features that don't work out or remain unused, and so that we don't remain overly conservative about accepting new ideas.

@alainfrisch
Copy link
Copy Markdown
Contributor Author

Subsumed by @objmagic and @yallop work on open <module_expr>, see #683.

stedolan pushed a commit to stedolan/ocaml that referenced this pull request Sep 21, 2022
Remove set_of_closures_symbol_ref etc.
EmileTrotignon pushed a commit to EmileTrotignon/ocaml that referenced this pull request Jan 12, 2024
…ges (ocaml#682)

Co-authored-by: Sabine Schmaltz <sabine@tarides.com>
Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment

Projects

None yet

Development

Successfully merging this pull request may close these issues.

8 participants