RFC: add (strict_includes) in (library) and (executable)#430
RFC: add (strict_includes) in (library) and (executable)#430hnrgrgr wants to merge 3 commits intoocaml:masterfrom
(strict_includes) in (library) and (executable)#430Conversation
|
Thanks, that seems like a reasonable feature. A few remarks before looking at the code:
(library
((name a)
(modules (a))))
(library
((name b)
(modules (b))
(libraries (a))))
(library
((name c)
(modules (c))
(libraries (b))))
(library
((name d)
(modules (d))
(strict_includes)
(libraries (c))))? Given that all the cmi will end up in the same directory, I assume no, which is not good. #427 would solve this issue, would you consider implementing it? :) |
8bf0d42 to
34b5df1
Compare
Agreed.
You are right. I will have a look into #427, but probably not before the end of the week. |
c9d3072 to
ee0d33c
Compare
|
I found some time earlier than expected for prototyping #427, I would appreciate your feedback. The current patch only moves the OCaml objects ( Regarding the 'strict' inclusion of library, I tried to gave it more thoughs and the disjunction between |
25c5994 to
4ec65b9
Compare
Cool, thanks. I'd rather not rely on symlinks as on Windows this will cause one more file copy. Does it work if we just change the compiler invocation to:
That should be fine.
So, how does it work, do libraries essentially have two sets of dependencies? API ones and link time ones? Note that this is actually what we do inside Jane Street, however we do it automatically: the API dependencies are computed by reading the output of ocamlobjinfo. Though this requires being able to associate modules to libraries. |
I think this would be a sane default: you can only use symbols in the libraries defined in the |
That's what the patch does, it currently invokes the compiler the way you proposed. The symlink is the other way round (e.g. |
Sort of, the API one being included in the link time one. In term of implementation, I just added a boolean in type t =
| Internal of Internal.t * bool
| External of Findlib.package * bool
That makes sense, and would simplify the implementation. |
If the behavior you propose is to only see modules that are in libraries defined in the |
I am not sure to understand. In both cases you can define new module aliases, and that should just work, right? |
If you have |
Ah, I see. Sorry, I didn't read the patch carefully enough. That makes sense. |
|
I'm a bit lost in all the discussions, can someone post a jbuild file defining several libraries with some use of the new syntax and explain what libraries are expected to be seen by each library? |
|
Here is the current semantics and syntax: |
|
OK, thanks. BTW are there cases where we want the strict behavior only for a subset of the dependencies listed in the About changing the default, you can try but I'm pretty sure that this will break tons of stuff. |
|
Wait, in your example, shouldn't |
I am not sure. The patch was simpler this way and I am not against using Thanks for your feedbacks. |
It does, I fixed the comment. |
|
OK, so in the new world, is writing |
|
For instance, I was trying to think of this feature as:
but that still doesn't work for Core. |
|
BTW, I guess Core is not the common case. If it help with real use cases, I'm fine with the feature as it is now. |
|
I thought a bit more about this, it seems to me that:
Based on that I suggest:
Then in a distant future we could try to make the strict behavior the default. What do you think? |
|
Great, thanks! |
|
I was thinking about this again and it seems to me that the best would be a compiler feature to express that we want to reexport a module. i.e. something like this: module X = Foo.X [@@reexport]
include Y [@@reexport]that would effectively inline the cmi contents instead of just creating aliases |
|
@Octachron doesn't that make sense? If a client code is relying on some library to introduce some type equalities (or |
|
I imagine that asking users to do some dependency analysis on their own in those corner cases is sensible, but this is also a warning that there is observable differences between building with and without the transitive closure of dependencies. Moreover, hiding type equalities do not always result in errors: Hiding an equality |
|
Technically, it doesn't have to break the optimization right? I mean, if a type Overall, I feel like they will be corner cases, however we should be able to solve them. Providing this feature and using it will allow us to discover them. |
|
Sure, the cmi representation could be changed to record the full type declaration and hide it, rather than link to the original definition like currently. I rather agree with your analysis, at most it could be nice to document those known corner cases. |
|
Documenting them seems right |
|
@hnrgrgr if you don't have time, we can help out with rebasing this work as well. Just let us know. |
|
@diml I'm wondering if the project itself is actually the correct place to set this mode. I assume that we'd start the transition to this new mode, and eventually, we'd make this the default. In which case this option would be obsolete. At the moment, I'm fairly convinced that the current recursive behavior is useless apart from backwards compatibility concerns. So I think it would be more trouble than its worth to make an option that would eventually removed available in so many places. Extending libraries can be handled with a field like
This concern seems orthogonal to me. Nevertheless, it seems useful to be able to control interface dependencies this way. How about adding an ordered set valued |
|
It would indeed be nice to make it the default in That is how I expect things to go:
Then when we are ready to release
|
|
Alright, that sounds like a good idea. Btw, I think that rebasing this PR is not realistic - far too much has changed. I guess I could try writing this myself if Gregoire is preoccupied. |
|
I also agree with the proposal. I won't be able to work on it before
October. Feel free to rebase/rewrite, if it is more urgent.
|
|
Noted. Just in case we work on this before October, @hnrgrgr could you sign off the current commits and force-push? |
|
I think it's unlikely that I'd get to work on this before the ICFP as well. Feel free to take your time. |
Sometimes, in order to enforce abstractions, we might want to build a library without allowing access to all the `.cmi` of the 'recursive' libraries it depend upon. For instance, we might want to call the OCaml compiler with only the `-I` of the explicitly listed libraries and not the `-I` of their recursives closures. This patch allow this behaviour by introducing an optional `!` as first character of a dependent library. The default behaviour is not changed and obviously this patch preserves the recursive closure of dependent libraries when calling the executable linkers.
4ec65b9 to
c1ee1d3
Compare
|
Sorry for coming to this very late, but I'm not sure this is the right approach to this issue. Basically, there are two kinds of lookups in the type environment:
For example, when you write: type t = Foo.tin a file Transitive dependencies are a bad thing for lookups of the first kind: you only want the user to be able to write However, for lookups of the second kind the compiler always assumes that the transitive closure of dependencies will be available -- and which lookups it does is essentially an implementation detail that the user has no real control over. (A third kind of lookup is looking up a So if you provide the transitive dependencies you get extra names in scope, whereas if you don't you get unexpected errors. With the compiler as it is now the only known workaround for this issue is to have the build system The best solution would be to change how the compiler works. All that is needed is a distinction between include paths that should be available for the both kinds of lookup and include paths that should only be available for the second kind. I think that this would be fairly easy to add now thanks to @diml's recent changes to how the initial environment is represented. |
|
Perhaps it changed from previous version, but @garrigue was advising the hiding of cmi in this mantis ticket https://caml.inria.fr/mantis/view.php?id=6823 . |
|
I revived this in #1745 |
CHANGES: - Second step of the deprecation of jbuilder: the `jbuilder` binary now emits a warning on every startup and both `jbuilder` and `dune` emit warnings when encountering `jbuild` files (ocaml/dune#1752, @diml) - Change the layout of build artifacts inside _build. The new layout enables optimizations that depend on the presence of `.cmx` files of private modules (ocaml/dune#1676, @bobot) - Fix merlin handling of private module visibility (ocaml/dune#1653 @bobot) - unstable-fmt: use boxes to wrap some lists (ocaml/dune#1608, fix ocaml/dune#1153, @emillon, thanks to @rgrinberg) - skip directories when looking up programs in the PATH (ocaml/dune#1628, fixes ocaml/dune#1616, @diml) - Use `lsof` on macOS to implement `--stats` (ocaml/dune#1636, fixes ocaml/dune#1634, @xclerc) - Generate `dune-package` files for every package. These files are installed and read instead of `META` files whenever they are available (ocaml/dune#1329, @rgrinberg) - Fix preprocessing for libraries with `(include_subdirs ..)` (ocaml/dune#1624, fix ocaml/dune#1626, @nojb, @rgrinberg) - Do not generate targets for archive that don't match the `modes` field. (ocaml/dune#1632, fix ocaml/dune#1617, @rgrinberg) - When executing actions, open files lazily and close them as soon as possible in order to reduce the maximum number of file descriptors opened by Dune (ocaml/dune#1635, ocaml/dune#1643, fixes ocaml/dune#1633, @jonludlam, @rgrinberg, @diml) - Reimplement the core of Dune using a new generic memoization system (ocaml/dune#1489, @rudihorn, @diml) - Replace the broken cycle detection algorithm by a state of the art one from [this paper](https://doi.org/10.1145/2756553) (ocaml/dune#1489, @rudihorn) - Get the correct environment node for multi project workspaces (ocaml/dune#1648, @rgrinberg) - Add `dune compute` to call internal memoized functions (ocaml/dune#1528, @rudihorn, @diml) - Add `--trace-file` option to trace dune internals (ocaml/dune#1639, fix ocaml/dune#1180, @emillon) - Add `--no-print-directory` (borrowed from GNU make) to suppress `Entering directory` messages. (ocaml/dune#1668, @dra27) - Remove `--stats` and track fd usage in `--trace-file` (ocaml/dune#1667, @emillon) - Add virtual libraries feature and enable it by default (ocaml/dune#1430 fixes ocaml/dune#921, @rgrinberg) - Fix handling of Control+C in watch mode (ocaml/dune#1678, fixes ocaml/dune#1671, @diml) - Look for jsoo runtime in the same dir as the `js_of_ocaml` binary when the ocamlfind package is not available (ocaml/dune#1467, @nojb) - Make the `seq` package available for OCaml >= 4.07 (ocaml/dune#1714, @rgrinberg) - Add locations to error messages where a rule fails to generate targets and rules that require files outside the build/source directory. (ocaml/dune#1708, fixes ocaml/dune#848, @rgrinberg) - Let `Configurator` handle `sizeof` (in addition to negative numbers). (ocaml/dune#1726, fixes ocaml/dune#1723, @Chris00) - Fix an issue causing menhir generated parsers to fail to build in some cases. The fix is to systematically use `-short-paths` when calling `ocamlc -i` (ocaml/dune#1743, fix ocaml/dune#1504, @diml) - Never raise when printing located errors. The code that would print the location excerpts was prone to raising. (ocaml/dune#1744, fix ocaml/dune#1736, @rgrinberg) - Add a `dune upgrade` command for upgrading jbuilder projects to Dune (ocaml/dune#1749, @diml) - When automatically creating a `dune-project` file, insert the detected name in it (ocaml/dune#1749, @diml) - Add `(implicit_transitive_deps <bool>)` mode to dune projects. When this mode is turned off, transitive dependencies are not accessible. Only listed dependencies are directly accessible. (ocaml/dune#1734, ocaml/dune#430, @rgrinberg, @hnrgrgr) - Add `toplevel` stanza. This stanza is used to define toplevels with libraries already preloaded. (ocaml/dune#1713, @rgrinberg) - Generate `.merlin` files that account for normal preprocessors defined using a subset of the `action` language. (ocaml/dune#1768, @rgrinberg) - Emit `(orig_src_dir <path>)` metadata in `dune-package` for dune packages built with `--store-orig-source-dir` command line flag (also controlled by `DUNE_STORE_ORIG_SOURCE_DIR` env variable). This is later used to generate `.merlin` with `S`-directives pointed to original source locations and thus allowing merlin to see those. (ocaml/dune#1750, @andreypopp) - Improve the behavior of `dune promote` when the files to be promoted have been deleted. (ocaml/dune#1775, fixes ocaml/dune#1772, @diml) - unstable-fmt: preserve comments (ocaml/dune#1766, @emillon) - Pass flags correctly when using `staged_pps` (ocaml/dune#1779, fixes ocaml/dune#1774, @diml) - Fix an issue with the use of `(mode promote)` in the menhir stanza. It was previously causing intermediate *mock* files to be promoted (ocaml/dune#1783, fixes ocaml/dune#1781, @diml) - unstable-fmt: ignore files using OCaml syntax (ocaml/dune#1784, @emillon) - Configurator: Add `which` function to replace the `which` command line utility in a cross platform way. (ocaml/dune#1773, fixes ocaml/dune#1705, @Chris00) - Make configurator append paths to `$PKG_CONFIG_PATH` on macOS. Previously it was prepending paths and thus `$PKG_CONFIG_PATH` set by users could have been overridden by homebrew installed libraries (ocaml/dune#1785, @andreypopp) - Disallow c/cxx sources that share an object file in the same stubs archive. This means that `foo.c` and `foo.cpp` can no longer exist in the same library. (ocaml/dune#1788, @rgrinberg) - Forbid use of `%{targets}` (or `${@}` in jbuild files) inside preprocessing actions (ocaml/dune#1812, fixes ocaml/dune#1811, @diml) - Add `DUNE_PROFILE` environment variable to easily set the profile. (ocaml/dune#1806, @rgrinberg) - Deprecate the undocumented `(no_keep_locs)` field. It was only necessary until virtual libraries were supported (ocaml/dune#1822, fix ocaml/dune#1816, @diml) - Rename `unstable-fmt` to `format-dune-file` and remove its `--inplace` option. (ocaml/dune#1821, @emillon). - Autoformatting: `(using fmt 1.1)` will also format dune files (ocaml/dune#1821, @emillon). - Autoformatting: record dependencies on `.ocamlformat-ignore` files (ocaml/dune#1824, fixes ocaml/dune#1793, @emillon)
Summary: Before D38213812 if a target *t* depended on a library *a* where *a* depends on a library *b*, then the modules of *b* were accessible to *t*. In D38213812 we changed this behavior so that in order to use the modules of *b* from *t*, then *b* must be in *t*'s `deps` (that is, a "direct" dependency). Why? williewillus [noted](https://fb.workplace.com/groups/403065016408786/permalink/8781197225262148/) buck2 differed from buck1 in this respect, admitting transitive use where buck1 did not. I argued then that we wrote the buck2 build rules that way with the understanding we were overcoming a buck1 shortcoming. However opinion prevailed that the buck1 behavior was "better". For example, mroch argued "I was able to depend on `String_utils` in `Server_files_js`, because `Server_files_js` depends on `Sys_utils` and `Sys_utils` depends on `String_utils`. That's an implementation detail! I should have to explicitly depend on things I use directly, not get them for free because something else used them transitively". We found cause to reconsider this restriction. Consider this program: ``` let (): unit = Core.Printf.printf "Hello world!" ``` The dune file for the above program would look something like this: ``` (executable (name core-test) (libraries core) ) ``` When we translate that into a buck2 target we get the following. ``` ocaml_binary( name = "core-test", srcs = ["core_test.ml"], external_deps = [ ("supercaml", None, "core"), ], ) ``` Before this diff (in the presence of the no transitive deps rule) building this target results in an error: ``` 1 | let (): unit = Core.Printf.printf "Hello world!" ^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^ Error: This expression has type Core__.Import.unit = Base.Unit.t but an expression was expected of type unit ``` The issue is that for this to work there needs to be a `-I` for `Base` on the compile line and we deliberately prune it out! So, this is a fix ``` ocaml_binary( name = "core-test", srcs = ["core_test.ml"], external_deps = [ ("supercaml", None, "core"), ("supercaml", None, "base"), ], ) ``` but it's not satisfying and potentially hard to discover. If on the other hand we relax the no transitive deps rule, the build "just works" as it does with dune. Given this example, as samwgoldman put it, adding `-I`'s for transitive dependencies is the "lesser of two evils". *Note: There are ongoing discussions in the community about compiler support for solving the problems we are grappling with here. See e.g. ocaml/dune#430 (comment) and ocaml/RFCs#31 Reviewed By: samwgoldman Differential Revision: D43504245 fbshipit-source-id: ccb70c4b6c8e700287e1c5eb8eff0677a8b7f855
Summary: Before D38213812 (d872b98) if a target *t* depended on a library *a* where *a* depends on a library *b*, then the modules of *b* were accessible to *t*. In D38213812 (d872b98) we changed this behavior so that in order to use the modules of *b* from *t*, then *b* must be in *t*'s `deps` (that is, a "direct" dependency). Why? williewillus [noted](https://fb.workplace.com/groups/403065016408786/permalink/8781197225262148/) buck2 differed from buck1 in this respect, admitting transitive use where buck1 did not. I argued then that we wrote the buck2 build rules that way with the understanding we were overcoming a buck1 shortcoming. However opinion prevailed that the buck1 behavior was "better". For example, mroch argued "I was able to depend on `String_utils` in `Server_files_js`, because `Server_files_js` depends on `Sys_utils` and `Sys_utils` depends on `String_utils`. That's an implementation detail! I should have to explicitly depend on things I use directly, not get them for free because something else used them transitively". We found cause to reconsider this restriction. Consider this program: ``` let (): unit = Core.Printf.printf "Hello world!" ``` The dune file for the above program would look something like this: ``` (executable (name core-test) (libraries core) ) ``` When we translate that into a buck2 target we get the following. ``` ocaml_binary( name = "core-test", srcs = ["core_test.ml"], external_deps = [ ("supercaml", None, "core"), ], ) ``` Before this diff (in the presence of the no transitive deps rule) building this target results in an error: ``` 1 | let (): unit = Core.Printf.printf "Hello world!" ^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^ Error: This expression has type Core__.Import.unit = Base.Unit.t but an expression was expected of type unit ``` The issue is that for this to work there needs to be a `-I` for `Base` on the compile line and we deliberately prune it out! So, this is a fix ``` ocaml_binary( name = "core-test", srcs = ["core_test.ml"], external_deps = [ ("supercaml", None, "core"), ("supercaml", None, "base"), ], ) ``` but it's not satisfying and potentially hard to discover. If on the other hand we relax the no transitive deps rule, the build "just works" as it does with dune. Given this example, as samwgoldman put it, adding `-I`'s for transitive dependencies is the "lesser of two evils". *Note: There are ongoing discussions in the community about compiler support for solving the problems we are grappling with here. See e.g. ocaml/dune#430 (comment) and ocaml/RFCs#31 Reviewed By: samwgoldman Differential Revision: D43504245 fbshipit-source-id: ccb70c4b6c8e700287e1c5eb8eff0677a8b7f855
Sometimes, in order to enforce abstractions, we might want to build a library without allowing access to all the
.cmiof the "recursive" libraries it depends upon. For instance, we might want to call the OCaml compiler with only the-Iof the explicitly listed libraries and not the-Iof their recursives closures.This patch allows this behaviour by introducing an optional
(strict_includes)stanza in(library)and(executables).The default behaviour is not changed and obviously this patch preserves the recursive closure of dependent libraries when calling the executable linkers.