Allow the user to shadow sub-modules of Pervasives#1513
Allow the user to shadow sub-modules of Pervasives#15131 commit merged intotrunkfrom unknown repository
Conversation
There was a problem hiding this comment.
I read the code and I think the patch is safe and correct. Two small points (the second one was raised in #1010):
- A similar change to
initial_envneeds to be done inocamldoc/odoc_analyse.ml. - This patch introduces a small inconsistency in the handling of case-sensitivity of
.cmifilenames. In case-insensitive filesystems (Windows, Mac) one can find moduleFooinFOO.CMIfor example. However, if we make a fileLARGEFILE.CMIvisible using an-Iflag the modulePervasives.LargeFilewill not be shadowed.
|
Ok. I tried the other approach of filtering the modules to add to the environment using Regarding performances, I tried passing 100 include directories and observed a 0.001s slowdown compared to master, so it's negligible. |
|
This looks good to me. Unless someone objects we could merge this and move onwards to #1010! @gasche? @alainfrisch? |
|
I am nervous about this change, but also I haven't had the time to review it to understand what it does, and I am not sure that I will have this time today or in the next few days. I would rather wait before inclusion. (I am actually also a bit nervous about #1010, but so far I have been happy to let other people take decisions for this. Honestly, for this kind of delicate changes with long-term impact, I would rather see a somewhat more formal process than an undirected GPR discussion.) |
|
I would like to see a specification of what this change does in slightly higher-level terms than its implementation. My mental model of the current module lookup semantics in OCaml is that include paths are passed, in some order, to the compiler, and that the initial module-name environment of a program refers, for each possible module name, to the "first" compilation unit with a compatible name in include paths, for a notion of "first" defined by a fixed priority order. The standard library and the current working directory are implicitly added as include directories in some defined position. My understanding of the current semantics of Pervasives and Assuming that this mental model of the current semantics is more or less correct, what is the intended semantics, in these terms, after this patch is applied? Is it documented somewhere that the user can read about? |
|
(As a meta comment: I am surprised that a patch in this design space would make the behavior more correct by adding more code. If we are looking for consistency and predictability, adding more logic is a bit dangerous, and I would rather expect a patch to make things right by removing code, or at most by moving existing code around.) |
|
I agree this shouldn't be discussed just on github. #1010 was discussed and accepted during a developer meeting, but we can discuss it more if doubts remain. Regarding the current semantic, this is basically what you say except that the current directory is not implicitly added. The stdlib directory is added first and so is always considered after all the user supplied include directories when looking up a module.
Yes
After the patch is applied, user supplied include directories have precedence over sub-modules of Here is one way to formalize this a bit more:
Assuming the user has passed After this patch, the initial environment is built as follow: Note that the order in which the So what this patch provides is the ability to shadow sub-modules of
I agree this is a bit surprising. A much more straightforward approach would be to implement It's possible that this change is worth doing though, and it's likely that namespaces will require it, but it'd require more discussion. |
|
I just want to say that this explanation is very good and should definitely be added to the documentation. :) |
|
Another way to see it would be a variant of Would let log x = print_endline x; x
....
open TheLib
log (foo ++ bar) ** bazThen the next version of the library adds a toplevel I'm not arguing in favor of adding |
|
@diml: hanks for the explanation! (I didn't mean to delay any decision taken for #1010, and I know that it's a lot of painful grunt work for you -- I'll stick to discussing the current PR.) Your explanation raises a question: when |
|
@Drup I'm happy to add this explanation to the doc, do you have an idea where it should go? @alainfrisch I guess @gasche It's well possible that this shadowing problem was overlooked in the initial discussion about
but it's not perfect to replace the stdlib because of this shadowing problem. So even with #1010 and this PR, more work needs to be done to be able to really replace the stdlib by something else. But at least we can add new modules to the stdlib. |
|
@alainfrisch I don't think there's much value in adding open*. Its utility disappears as soon as we modify your example slightly: If TheLib adds log, we have the same problem. The only real solution is to have an easy way to specify exactly what you're importing. There's no easy way to do that in OCaml -- you have to create another module: It would be nice if we had more convenient syntax for this: For convenience of reuse, it would also be nice if we could have syntax to create modules easily: As a final thought, modules are OCaml's 'currency of trade'. The easier we make it create and manipulate them, the better the language becomes. |
|
I think the value for
I wrote explicitly that I'm not suggesting to add To be clear: I'm in favor of this semantics (but I did not review the code). |
typing/envaux.ml
Outdated
|
|
||
| exception Error of error | ||
|
|
||
| module StringSet = Set.Make(String) |
There was a problem hiding this comment.
You can reuse Misc.StringSet.
typing/env.mli
Outdated
| | Env_class of summary * Ident.t * class_declaration | ||
| | Env_cltype of summary * Ident.t * class_type_declaration | ||
| | Env_open of summary * Path.t | ||
| | Env_open of summary * Set.Make(String).t * Path.t |
There was a problem hiding this comment.
Also: Misc.StringSet.t.
typing/env.ml
Outdated
|
|
||
| module StringSet = | ||
| Set.Make(struct type t = string let compare = String.compare end) | ||
| module StringSet = Set.Make(String) |
There was a problem hiding this comment.
Line can be removed (Misc is opened).
The current Pervasives defines one only module. The proposal in #1010 is that Pervasives (or Stdlib) would export as many modules as they are currently in the stdlib (about 50). In the OPAM world, it's not uncommon to compile with 30 -I directories. So that would be 15x worst than your benchmark. It's still fine if your timing is representative of the reality, but it might be worth checking under Windows where syscall might be much slower. |
|
I checked under Windows and the timing difference seemed to be of a similar order. |
|
Hmm, actually I just tried on Linux and the slow-down is noticeable. I'm going to try to rebase #1010 on top of this PR and try on a real build, because in practice we do some of this lookup anyway for stdlib modules. Here we just do it eagerly for all stdlib modules. |
|
So I did the following experiment: I created a big jbuilder workspace including jenga and all its transitive dependencies. Then I built the jenga binary with I did the build with both trunk and #1010 rebased on top of #1513. I did the build on Linux. These are the results I got:
So the build is about 1% slower. Implementing what I describe in #1010 (comment) would solve the performance problem, but it is a bigger change and it is a breaking change on case-sensitive file systems, so it is likely to take time to develop and release. Personally, I'm in favor of accepting the small slowdown for now and then looking at another implementation later. Does that sound good? BTW, the build of Base is broken by #1010. We do complicated things in Base to shadow the stdlib (we inspect stdlib.cma and pervasives.cmi) so it's not surprising. |
|
(For the record, I'm satisfied with Jérémie's explanation. I haven't reviewed the code, so I don't approve the PR, but in general I trust his patches so no objection here -- although I think that having some documentation of the whole thing would be nice.) @nojb and @alainfrisch, I'll let you drive this PR; any remaining concerns on performance or design, or merge decisions, are yours to express :-) |
|
Thanks. BTW, I won't be able to provide support over the chrismas period, so in any case I won't merge this myself before next year. |
driver/compmisc.ml
Outdated
| if !Clflags.nopervasives then initial else | ||
| open_implicit_module "Pervasives" initial | ||
| if !Clflags.nopervasives then initial else begin | ||
| let include_dirs = !Clflags.include_dirs in |
There was a problem hiding this comment.
Shouldn't you rather use Config.load_path (assuming it is already initialized from include_dirs at this point)? include_dirs does not necessarily include the directory of the compiled unit, which is automatically added to load_path (in Compmisc.init_path).
Changes
Outdated
| when configured with -no-flat-float-array | ||
| (Jeremy Yallop, report by Gabriel Scherer) | ||
|
|
||
| - GPR#1513: Allow include directories to shadow sub-modules of the |
There was a problem hiding this comment.
I'm not sure this explanation is very clear for the users (esp. if you include the current directory, which is part of the load path but not technically an "include directory"). What about "Compilation units are allowed to shadow sub-modules of Pervasives"?
There was a problem hiding this comment.
Ok, that looks better. I'll update the text
|
|
I'll add a test. BTW, I realized there is something untrue about the new semantic I wrote before; because we look for i.e. the include directories are added twice: once before opening |
|
I added a test and addressed the review comments. |
|
Ok, the implementation, while quite short, is quite a bit more subtle than I expected. It implements the semantics outlined by @diml (at least, as far as I can tell) but in a fairly different way. Basically, it modifies the One small thing that I do not like is that the only filtering list which matters (i.e., the one used to open "Pervasives": all the others are empty) is defined in So, I would propose to add two functions, Additionally, given how thoroughly you take care of propagating Apart from that, it looks ok to me. |
Yh, it's the only I can think of to avoid a breaking change on Windows due to the file system.
Seems fine. Looking at this again, we can make things a bit more explicit by replacing type where_to_open = Before_load_path | After_load_pathAnd for the
Making it required seems fine. It's a breaking change, but I don't think we care too much here (?) |
|
I refactored the code a bit. In the end I didn't expose I also added |
|
Great, thanks all for your reviews and comments. I will rebase and merge this today and then i'll restart the work on #1010 |
- this is a port of ocaml/ocaml#1513 - This patch has been in OCaml since 4.07, but it hadn't been in melange. - This increases compatibility with upstream OCaml even further
- this is a port of ocaml/ocaml#1513 - This patch has been in OCaml since 4.07, but it hadn't been in melange. - This increases compatibility with upstream OCaml even further
* Announce for the second release candidate of OCaml 5.1.0 * Name fix * Various fixes * One minor spelling/formatting change change log to changelog * Fix format --------- Co-authored-by: Christine Rose <christinerose@users.noreply.github.com>
Before this PR:
After this PR:
Sub-modules of
Pervasivesalways shadow modules coming from the include directories, so users cannot redefine them. This PR changes that by filtering the modules that are added to the environment when openingPervasives.This is especially important for #1010, since the main goal of #1010 is to allow to add new modules to the standard library.