Ocamldep modalias #286

Merged
garrigue merged 2 commits into ocaml:trunk from garrigue:ocamldep-modalias
Nov 30, 2015

@garrigue
Contributor

(This description was edited. Some comments may refer to parts that have disappeared, such as support for prefixing through -oprefix / -no-prefix)

This pull request adds module alias support to ocamldep.
The idea is not necessarily to support everything, but to allow users to build their libraries using -no-alias-deps without too much hassle.

There are 2 new options added to ocamldep:

  • -map name.{ml,mli}

    Parse name.ml or name.mli and build a dependency map used by other files, if they open the module Name.
  • -no-alias-deps

    Match the behavior of ocamlc's -no-alias-deps, i.e. do not add dependencies for module aliases when they are not accessed. The behavior should be conservative for interface files (all dependencies are detected), but may miss dependencies for implementation files if the interface coerces some of the exported module aliases. This is only useful if the file contains module aliases. In practice, using it only when you want to compute the dependencies of a map file seems sufficient.

There is no support for prefixing: the source file names are supposed to match the compiled versions, using copying or symbolic links.

There is a small example in testsuite/tests/no-alias-deps2.
In particular, you can see there that a standard way to use it would be

ocamldep -map lib.ml -open Lib LibA.ml LibB.ml LibC.ml

This means that the bindings in the file lib.ml are to be used as a map, and this map is enabled in all files by -open Lib (otherwise one would have to add open Lib manually to each file where it is needed).
Note that the map file is not a special file. For instance in the above example, the contents are:

module Packed = struct
  module A = LibA
  module B = LibB
  module C = LibC
end
include Packed

This demonstrates that include works properly.

Of course, one can alternatively write a lib.mli, where the contents would be:

module Packed : sig
  ...
end
include (module type of Packed)

If you have both a lib.ml and a lib.mli, you may want to generate dependencies for them. In that case, you should write:

ocamldep -no-alias-deps lib.ml lib.mli

which avoids creating a circular dependency for the cmx's.
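
A single-file sketch can make the effect of the map concrete. It is only an approximation under stated assumptions: LibA and LibB are written as inline modules standing in for separate compilation units, their contents (the name values) are invented for illustration, and the explicit open Lib plays the role of the -open Lib flag.

```ocaml
(* Stand-ins for the compilation units LibA and LibB.
   Their contents are hypothetical, chosen only for this sketch. *)
module LibA = struct let name = "LibA" end
module LibB = struct let name = "LibB" end

(* Same shape as the map file lib.ml shown in the description. *)
module Lib = struct
  module Packed = struct
    module A = LibA
    module B = LibB
  end
  include Packed
end

(* Plays the role of passing -open Lib on the command line. *)
open Lib

(* Body of a client file: A is reachable thanks to "include Packed",
   and Packed.B is the same module as B. *)
let message = "LibC uses " ^ A.name ^ " and " ^ Packed.B.name

let () = print_endline message
```

In a real build each stand-in would be its own file, and the point of -map is that ocamldep can then resolve the short names A and B to the files LibA and LibB.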

@bobot
Contributor

bobot commented Nov 10, 2015

It is great that you tackle this problem (I'm currently trying to convert Frama-C from pack to module-alias) and I appreciate the clean-up in depend.mli. Questions:

  1. Why did you choose to have -map take an .ml file instead of a .cmi as argument? If it is a map used for compiling the library, it must be compilable before the other files, and with a .cmi we are sure that all the OCaml constructs are taken into account.

  2. The main design problem is about -oprefix

    Indeed, if your application has more than one library, you need dependencies on the other Lib2A.cm* Lib2B.cm*. So the current -oprefix doesn't work. The other possibility, ocamldep -with-prefix Lib A.ml B.ml C.ml -end-prefix, seems not too hard to create automatically. Could -noprefix disappear if -with-prefix is used?

  3. What do you think of ocamldep -paths #146 ? It seems to me that your solution is simpler to use at least in Makefile.

  4. Do you have ideas of what should be done in ocamlbuild?

@lefessan
Contributor

I am a bit worried by all this. There were interesting proposals to add namespaces in the language, that have been rejected in favor of the "minor modifications" required by module aliases to have something equivalent to namespaces. Now, you propose to add 4 different options to ocamldep. For me, it questions the "simplicity" of module aliases as a replacement for namespaces.

@garrigue
Contributor Author

@bobot:
Using a .cmi might be simpler, but most users will expect ocamldep to work on sources, so I prefer to keep it that way. And in fact, extracting that information directly in depend.ml was not too hard, just a question of getting the right model.

For the prefixing, being explicit seems indeed better. To have the full power, these should actually be file paths, so that there is no ambiguity. I think I'll have to rewrite most of that part. Fabrice is right at least on one point: it would be simpler to connect the prefixing to the directory hierarchy, like in Java. The interesting thing is that it could be done somehow in ocamldep alone (ocamlc only sees already prefixed names). Could be a good next step.

@lpw25
Contributor

lpw25 commented Nov 10, 2015

We use -no-alias-deps and it is not clear to me that these changes are really needed. The best way to use -no-alias-deps is to combine it with -o so that the *.ml files have the same names as all references to them from within the same library (e.g. Foo corresponds to foo.ml) which means that ocamldep -modules already produces correct output.

It seems that the changes here are aimed at supporting using ocamldep without the -modules option. Since most OCaml build systems now use the -modules option (and it seems reasonable to expect them to add support for it if they want to easily support -no-alias-deps) I don't think it is worth adding this support.

If we want people to use -no-alias-deps I think the main thing that is needed is to add some support to ocamlbuild. The simplest such support would probably be a prefix tag that would use -o to add a prefix to a module and also create a "map" module mapping all prefixed names, which would be automatically -opened for every file being compiled.

A more usable solution would be some kind of .mllib file that would describe the intended mapping from source files to aliases within a module and ocamlbuild would automatically take care of all the details.

@dbuenzli
Contributor

> For me, it questions the "simplicity" of module aliases as a replacement for namespaces.

This is certainly OT. But for me module aliases have completely failed at solving the namespace problem. Even if you use them as a namespacing mechanism you still pollute the toplevel namespace, hence you still have to prefix your sources with ad-hoc prefixes and worse you actually have to expose these .cmi for the system to work which only multiplies the number of names the end-user sees and he should not since he should access them through the name space. Besides it prevents you from generating good documentation (see http://caml.inria.fr/mantis/view.php?id=6471).

@Drup
Contributor

Drup commented Nov 10, 2015

> But for me module aliases have completely failed at solving the namespace problem.

Still OT, but I agree, for a different reason: such "namespaces" are closed. You can't add more things in it. The fact that this is a problem is demonstrated by libraries such as Core_kernel/Core ....

@bobot
Contributor

bobot commented Nov 10, 2015

@lpw25 -modules doesn't give you enough information if you are compiling more than one namespace together. That's one of the reasons #146 was created. But I'm not sure that it is even possible to fix without additional information about the namespaces, since after opening two namespaces you don't know which of them the modules used afterwards come from.

So in the end the additions made specifically for makefiles are not so big.

@garrigue
Contributor Author

I am not sure this is related to ocamldep, but just a clarification:
Module aliases never pretended to solve the unique naming problem, they only provide the possibility to create structured views on a flat list of uniquely named modules.
Unique naming should be solved by a socially usable scheme, but I'm afraid there has been little progress on that side.

@Drup Module aliases let you extend namespaces, but only at the granularity of modules, by creating new views. I think there is nothing you cannot do, but this is not a fine grain namespace mechanism. There is a limit on what one can do with one single construct (which doesn't even extend the implementation language!).

@garrigue
Contributor Author

@lpw25 Your statement is only correct for using module aliases as a replacement for -pack. And it only works if you use a single map file (precluding getting fine grain dependencies between libraries for instance).

Also, my build tool of choice is make, and I have no plan to change that. You're not suggesting to deprecate make support in ocamldep I hope.

If there is no really good idea for how to handle prefixing, I may just scrap it, and include only the map file mechanism. Then one can just handle prefixing through symbolic links for make, or copying to different names for ocamlbuild.

Would it be useful to also allow the map file to be a cmi, like for #146 ?
The use case I see is when you want to depend on changes in an installed library, and you cannot see its map file. However, I'm not sure how much people would need that.

@dbuenzli
Contributor

On Wednesday, 11 November 2015 at 02:07, Jacques Garrigue wrote:

> Module aliases never pretended to solve the unique naming problem, they only provide the possibility to create structured views on a flat list of uniquely named modules.

But then I'm afraid to say that as far as I'm concerned this doesn't really solve any problem and rather — by introducing more names for the same things — only introduces more noise in the system.

I'm sure there are many reasons they are how they are now. But my initial naive expectations of them were that we would be able to write in the structuring module Super :

module Sub : sig … end = My_sub

and then only export super.cmi to the end user, hiding the My_sub.cmi so that they are not visible to the end user and compile the corresponding My_sub.cm{o,x} with some kind of mangling scheme so that their names are made unique in the toplevel name space.

But this is only a naive view not rooted in any form of technical or theoretical knowledge.

Best,

Daniel

@garrigue
Contributor Author

@dbuenzli

> But then I'm afraid to say that as far as I'm concerned this doesn't really solve any problem and rather — by introducing more names for the same things — only introduces more noise in the system.

There was demand for precisely that. Be able to add new names, without duplicating.

> I'm sure there are many reasons they are how they are now. But my initial naive expectations of them were that we would be able to write in the structuring module Super :

> module Sub : sig … end = My_sub

> and then only export super.cmi to the end user, hiding the My_sub.cmi so that they are not visible to the end user and compile the corresponding My_sub.cm{o,x} with some kind of mangling scheme so that their names are made unique in the toplevel name space.

There is no mangling whatsoever. WYSIWYG.
The trouble is that since in OCaml identical interfaces should be identical, and one can compile seeing only the interface, there is no way to create different mangled names for 2 modules using the same interface other than giving them different names to start with.

But you can of course do the following:
Create a new file My_sub2, with the interface you want.
Then write

module Sub = My_sub2

in Super.
What is missing here is the ability to hide intermediate names (for error messages), but the -short-paths option should do that in the next version, if you write My__sub2 with 2 underscores.
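
As a minimal single-file sketch of the aliasing approach just described, assuming inline modules in place of separate files (the contents of My__sub2 are invented for illustration): an alias only adds a name, so values and types flow freely between the two names.

```ocaml
(* Stand-in for the compilation unit My__sub2; its contents are hypothetical. *)
module My__sub2 = struct
  type t = { label : string }
  let make label = { label }
  let label v = v.label
end

(* The structuring module only introduces a new name for My__sub2. *)
module Super = struct
  module Sub = My__sub2
end

(* A value built through one name is accepted through the other,
   because Super.Sub.t and My__sub2.t are the same type. *)
let v : My__sub2.t = Super.Sub.make "demo"
let shown = My__sub2.label v

let () = print_endline shown
```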

@gasche
Member

gasche commented Nov 11, 2015

Please, it would be really helpful to keep this PR discussion focused on the support for module aliases in ocamldep and the build systems using it (which is a non-trivial question). I will only look at it this week-end, and if it has by then degenerated into a fifty-message battlefield on all things namespace it will be much more time-consuming than it should be to review and discuss Jacques' proposal.

For the record, there has been a discussion of support for module aliases in ocamlbuild/ocamldep on the caml-list back in January and I proposed a patch for ocamldep that is related: #146 .

I am sympathetic to the remark that module aliases fail to be a satisfying solution to the namespace problem both on usability and technical grounds (but it is also the best we could agree on in a long and difficult discussion). Feel free to discuss this, but elsewhere.

@garrigue
Contributor Author

@gasche do you have some comment yourself?
Particularly with respect to my last proposal of removing all the prefixing stuff, and concentrating only on the -map / -open feature.
My idea is that it should be enough for ocamlbuild, since it could choose to change the names of files when copying them to its own directory. If we can support both ocamlbuild and make this is a good start, since Jenga will probably be able to adapt its own support to that.

@lpw25
Contributor

lpw25 commented Nov 11, 2015

> Your statement is only correct for using module aliases as a replacement for -pack. And it only works if you use a single map file (precluding getting fine grain dependencies between libraries for instance).

True, but is that a problem? I'm not aware of anyone who wants to use module aliases for anything other than replacing pack. It is not like -map is a neat and clean solution to the problem, it is much closer to a hack. Hacks are fine when the problem needs to be solved, but it is not clear to me that this problem does need to be solved.

@garrigue
Contributor Author

@lpw25

> True, but is that a problem? I'm not aware of anyone who wants to use
> module aliases for anything other than replacing pack. It is not like -map
> is a neat and clean solution to the problem, it is much closer to a hack.
> Hacks are fine when the problem needs to be solved, but it is not clear to
> me that this problem does need to be solved.

  • Better support means that people have more space to experiment.
  • Currently there is no support whatsoever for make and ocamlbuild.
  • -map is not a hack but an accurate inference of the dependencies induced.
    Properly used you can do almost anything with it.

This seems enough to justify a new feature.

@lpw25
Contributor

lpw25 commented Nov 11, 2015

> -map is not a hack but an accurate inference of the dependencies induced.
> Properly used you can do almost anything with it.

I admit I haven't read the actual patch. How precise is it with respect to calculating module aliases? I assume that any use of functors etc. to create aliases will fail since they require type information.

@garrigue
Contributor Author

> I admit I haven't read the actual patch. How precise is it with respect to
> calculating module aliases? I assume that any use of functors etc. to
> create aliases will fail since they require type information.

This is not a problem, since you cannot create an alias of a functor
argument or of a functor application. One limitation I see is that you
should read the map files in the right order if you want the dependencies
to be computed transitively. I've not tested that yet.
The other limitation is that the shadowing from open is only computed
correctly for known structures, without functors; this is already an
improvement for ocamldep.

@garrigue
Contributor Author

I have removed the -oprefix and -no-prefix options.
See testsuite/tests/no-alias-deps2 for how to use this with make, with symbolic links.
Prefixing should rather be handled by ocamlbuild, but I don't know that codebase.

@bobot
Contributor

bobot commented Nov 12, 2015

How are the error messages located in A.ml when using the symlinks to LibA.ml?

@garrigue
Contributor Author

> How are the error messages located in A.ml when using the symlinks to LibA.ml?

I was not thinking of that.
But actually, you only need the symlinks to produce the dependencies.
After that you can erase them, and compile using the -o option, so that the error messages will be in the right source file.
I will update the example to follow this pattern.

@gasche
Member

gasche commented Nov 17, 2015

I spent some time this morning reviewing the proposed change. I don't have a detailed understanding of the implementation (I think some points could be clarified once we agree on the overall design), but I think that it is rather complementary to my #146 design (ocamldep -paths), and the current PR discussion helped clarify the use-cases and overall context of the proposal.

Below is a discussion and comparison of the current state and two patches, and some personal intuitions on where we could decide to go next.

Internal representations and output representations.

What is ocamldep about? ocamldep computes an internal representation of the dependency information of a module, and produces one of several output representations meant for consumers that need access to that internal representation.

Internal representations: the current internal representation is a set of free accessed module names (those that occur as the first component of the module paths in a program). In my patch proposal, the internal representation is extended to be a set of free accessed module paths. In Jacques' patch, the internal representation adds to the set of free accessed module names a structured object with a non-trivial tree structure, which he calls the "module resolution map", allowing module aliases (seen so far in the module or its environment) to be resolved even if they are at the leaves of a long module path.

Output representations: the current output representations are the Makefile format, ready for consumption by make by doing include .depend, and the -modules option that exposes the set of free accessed modules (that corresponds to the current internal representation). With my patch, the -paths option is added to expose more of the (new) internal representation. With Jacques' patch, there is no change in the format of output representations -- but of course the actual outputs are refined by taking the module resolution logic into account.

Module binding structure

Let me remark that there would exist a most general internal representation, of which all other representations are approximations. It is the module binding structure (in short, binding structure), of the program, which is an erasure of the parsed AST to retain only the structure of module operations. For example, the module binding structure of

List.iter ...;;
open Core;;
module L = List;;
L.iter (let open Stack in Set....);;
module Foo = struct
  module Bar = Baz;;
  include Foobar;;
  let x = ...;;
end;;

is something that can be represented as

access List;;
open Core;;
module L = List;;
access List; (open Stack in access Set);;
module Foo = struct
  module Bar = Baz;;
  include Foobar;;
end;;

While ocamldep has traditionally been a very coarse approximation of this binding structure, Jacques' patch goes significantly further in this direction.
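
To illustrate the gap between the binding structure and the coarse approximation, here is a small sketch that encodes a binding structure as data and extracts the set of free accessed module names from it. The types and function names (item, free_names, ...) are hypothetical and are not ocamldep's; the local scope of "let open Stack in" is flattened, and L.iter is encoded as an access through the local name L.

```ocaml
module S = Set.Make (String)

(* A module path, root first: Core.List is ["Core"; "List"]. *)
type path = string list

(* Hypothetical encoding of the module binding structure. *)
type item =
  | Access of path                (* something is reached through this path *)
  | Open of path                  (* open P *)
  | Include of path               (* include P *)
  | Alias of string * path        (* module X = P *)
  | Struct of string * item list  (* module X = struct ... end *)

(* Record the root of a path unless it is bound earlier in the file. *)
let use bound acc = function
  | m :: _ when not (S.mem m bound) -> S.add m acc
  | _ -> acc

(* Returns the free names of [items] and the names bound after them. *)
let rec free_names bound items =
  let step (acc, bound) = function
    | Access p | Open p | Include p -> (use bound acc p, bound)
    | Alias (x, p) -> (use bound acc p, S.add x bound)
    | Struct (x, body) ->
        let inner, _ = free_names bound body in
        (S.union acc inner, S.add x bound)
  in
  List.fold_left step (S.empty, bound) items

(* The example program from the comment above. *)
let example =
  [ Access [ "List" ];
    Open [ "Core" ];
    Alias ("L", [ "List" ]);
    Access [ "L" ];
    Open [ "Stack" ];
    Access [ "Set" ];
    Struct ("Foo", [ Alias ("Bar", [ "Baz" ]); Include [ "Foobar" ] ]) ]

let names = S.elements (fst (free_names S.empty example))

let () = print_endline (String.concat " " names)
```

The sketch stays conservative: after open Core it cannot tell whether List is a compilation unit or a submodule of Core, so it reports both names. Resolving that kind of question is what requires extra information about the environment.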

On user interfaces

Module aliases are difficult to support without complicating the interface, because providing a reasonable approximation of dependencies requires more information about the current compilation environment (the modules we will be compiling against and, crucially, the module aliases they export) than previously. Ocamldep has always performed a crude approximation, but it could get away with it; doing more complex things with module names makes us more demanding.

I have seen or heard criticism of the complexity of the various flags added by Jacques in this PR, both here in the comments and in other conversations. It is important to understand that there are, in general, two approaches to a user interface in reaction to this need for external information:

[richer output]: produce a more general output that is closer to the true binding structure, and let post-processing tools compute a better approximation if they are given more information on the compilation environment (for example, ocamlbuild could know about specific .mlalias files that define module aliases in a restricted way, and use that to post-process the -paths output for greater good)

[richer inputs]: give more information to ocamldep itself about the compilation environment, so that it can perform the refined approximation internally, and directly return better output representations in the same format as before.

My patch was going the [richer output] way. Jacques is going the [richer inputs] way, with the -map filename option -- and previously with the -oprefix option. People are (rightly) complaining that this makes the interface of the tool more complex, with a proliferation of options (that could be replaced by more generality in the output). But we should recognize that Jacques has the additional (self-imposed) design constraint that he relies on the "makefile" output representation. He needs something that works out of the box, without post-processing, and thus the [richer inputs] approach is the only one available. (He gave ground on the -oprefix thing, at the cost of considerable pain preparing the pre-invocation state with symlinks.)

As far as I can tell, the requirement that ocamldep keeps supporting Makefile users is fair and justified, and we should try to respect it in further evolution of the tools. Leo says "this is too complex, let's put the complexity in ocamlbuild instead", but ocamlbuild users will not pay for ocamldep's complexity anyway, and Makefile users may not have the luxury to avoid it.

Going forward

I like Jacques' proposed design. Given the constraints to have a working Makefile output, I don't see how to do better, and in fact it is quite simple. I don't understand the details of the -oprefix proposal, and because he rewrote his post I cannot access them anymore so I won't comment on them, but I would support an implementation of his approach.

Re. the status of -open Lib, we have two choices:

  • have -open Lib range over all following file arguments, which forces, as @bobot remarked, files using different -open choices to be passed as different invocations of ocamldep (I have no particular problem with that choice)
  • or make -open Lib only range over the following file argument and not the other ones, so that one would have to write ocamldep -open Lib foo.ml -open Lib bar.ml (that seems a bit harder to generate programmatically)

If we want to have nice ocamlbuild support on top of Jacques' work, we need to introduce a new output representation that reveals the internal representation in a nicely parseable way. One option would be to have -modules and -paths as proposed, and a separate output -aliases for the alias map alone, but this requires invoking ocamldep twice, so a representation combining them would be nicer (albeit more complex).

In the future, we should also think of an output representation giving the full module binding structure, to experiment with pixel-perfect dependency computation in OCaml.

@lpw25
Contributor

lpw25 commented Nov 17, 2015

> Leo says "this is too complex, let's put the complexity in ocamlbuild instead"

That is not what I am saying. I am saying that the complexity of the "-oprefix" stuff is irrelevant to tools such as ocamlbuild which use the "-modules" argument, whilst the complexity of the "-map" stuff is not needed for the only use case I am aware of: replacing "-pack".

The "-oprefix" stuff is gone now anyway so we can ignore that bit.

I have warmed slightly to the "-map" and "-no-alias-deps" options. However, I think we should be clearer about what problem they actually solve. There is in fact no issue with using ocamlc's "-no-alias-deps" option with ocamldep -- it continues to give a conservative approximation of the dependencies which is correct and sufficient for a correct compilation order.

The problem comes from two related options:

  • turning off warning 49, which lets you reference a module that has not yet been compiled
  • using the -o option, which breaks the simple relationship between the name of the ".mli" file and the ".cmi" file.

Turning off warning 49 gives ocamldep problems as it will over-approximate the dependencies and produce dependency cycles. Using the "-o" option gives ocamldep problems when producing makefile output because it gets the names of the ".mli" files wrong (but does not cause any problems if you are using "-modules").

The reason to use "-o" is so that you can have short source file names whilst still having long (and so more likely to be unique) module names. The reason to turn off warning 49 is to create a "map" module that aliases the short names to the long names, which can be used (via "-open") to keep all references between the modules using the short names. This allows you to keep your source files looking just like they do when using "-pack" whilst actually using module aliases.

When compiling using the approach just described you do not need to process the "map" module with ocamldep since it is known to have no dependencies (and is probably automatically generated by the build system). Meanwhile ocamldep with "-modules" still produces correct output for the rest of the modules. Without "-modules" the output is not quite correct because it uses the short module names for the names of the .cmi and .cmo files.

The "-map" and "-no-alias-deps" options aim to allow other (as yet unspecified) approaches based on allowing user-written "map" modules, which -- as with the "map" files in the approach above -- are to be compiled with warning 49 off and used via "-open" to provide some aliases to other modules. The "-no-alias-deps" option is used on the map module to avoid creating unnecessary dependencies. The "-map" option is used on the modules using the map to add additional dependencies which are needed to replace some of those taken away by "-no-alias-deps".

It is still not clear to me what the practical use case for these two options is, but I agree that they at least make sense and are not a total hack. I would prefer them to have names that better reflect what they do -- in particular I think that "-no-alias-deps" is not an accurate name, perhaps "-for-map" would be better. I also think that their behaviour should match each other precisely -- "-no-alias-deps" should only remove aliases which will be correctly interpreted by the "-map" option.

@garrigue
Contributor Author

Leo's understanding is mostly right, but

  • -map does not rely on -open, in the meaning that you can also choose to manually open the map module inside files.
  • -no-alias-deps really tries to do the same as ocamlc's option. However, when used on implementations it is not conservative, because it assumes that the interface does not hide anything, i.e. that the aliased modules are not coerced (a coercion would force the dependencies to be added early). This could be improved by taking the interface into account, but at this point I'm not sure it's worth it.

Concerning the -paths option, I don't understand enough of how ocamlbuild works to see how this combines. However, I can explain a bit more what module_map contains, which may help see how it could be exported. Basically, this is a map from paths to sets of compilation units, which one can use to compute the dependencies induced by referring to some path. Each path in module_map has one or more registered dependencies: for a root module, it depends on the corresponding map file, and for a submodule it may additionally depend on the file containing the aliased module. Submodules always have all the dependencies of their parents, plus their own dependencies.
There are two ways to compute the dependencies of a path P.M when it is used. Either you are in a binding position, and you add as immediate dependencies the dependencies registered in the map for the parent path P, and copy to the local map the dependency tree below P.M. Or you are in a coerced position, and you must add the dependencies of P.M, and all the submodules below (because you have no way to delay these dependencies anymore).
Hope this clarifies what the raw information means, and how one is supposed to use it.
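
The two lookups can be sketched as follows. This is a simplified model written for this explanation, not the code of depend.ml: the names node, find, bind and coerce are assumptions, and the example map contains just a root Lib with two aliased submodules.

```ocaml
module S = Set.Make (String)
module M = Map.Make (String)

(* A node holds the compilation units registered for a path,
   plus the nodes of its submodules. *)
type node = Node of S.t * node M.t

(* Follow a path from the root map to the node it designates. *)
let rec find map = function
  | [] -> None
  | [ m ] -> M.find_opt m map
  | m :: rest -> (
      match M.find_opt m map with
      | Some (Node (_, children)) -> find children rest
      | None -> None)

(* All units registered at a node and anywhere below it. *)
let rec all_units (Node (units, children)) =
  M.fold (fun _ child acc -> S.union (all_units child) acc) children units

(* Binding position (module X = P.M): depend now on what is registered
   for the parent P, and return the subtree at P.M for the local map. *)
let bind map path =
  let parent = match List.rev path with [] -> [] | _ :: rev -> List.rev rev in
  let immediate =
    match find map parent with Some (Node (units, _)) -> units | None -> S.empty
  in
  (immediate, find map path)

(* Coerced position: nothing can be delayed, so take P.M and everything below. *)
let coerce map path =
  match find map path with Some node -> all_units node | None -> S.empty

(* Example map: Lib comes from its map file; Lib.A and Lib.B alias LibA and LibB. *)
let leaf units = Node (S.of_list units, M.empty)

let lib_map =
  let subs =
    M.empty
    |> M.add "A" (leaf [ "Lib"; "LibA" ])
    |> M.add "B" (leaf [ "Lib"; "LibB" ])
  in
  M.singleton "Lib" (Node (S.singleton "Lib", subs))

let show set = String.concat " " (S.elements set)

let () =
  let immediate, _subtree = bind lib_map [ "Lib"; "A" ] in
  Printf.printf "binding Lib.A: %s\n" (show immediate);
  Printf.printf "coercing Lib: %s\n" (show (coerce lib_map [ "Lib" ]))
```

In this model, binding Lib.A only brings in the map file, while coercing Lib brings in the map file and both aliased units at once.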

@bobzhang
Member

My question: for -map lib.ml, do we need full OCaml language support in the map file? How about a JSON file instead, as in -map lib.json? Keeping it simple might help other efficient tools to be developed in the future.

@garrigue
Contributor Author

@bobzhang: I think this is irrelevant. ocamldep is about computing dependencies for OCaml programs, so it parses OCaml code. The code which handles map files is exactly the same as the code which computes dependencies for normal files.
If other tools would prefer another format, it could be possible to have ocamldep output that format, as @gasche already discussed in this thread.

@bobot
Contributor

bobot commented Nov 19, 2015

I think the need to prefix the files using copies or links just for ocamldep is far too complicated. For Frama-C we can't switch from pack to module aliases with this problem. In the Makefile output case, we should be able to give ocamldep more input information about the mapping from the filesystem to OCaml modules.

I propose to reduce the number of options to give by using the usual convention of separating files into directories: a single boolean option -map-directories, which tells ocamldep to use the map file for file translation in the directory in which it is located and in all its subdirectories, until there is another -map file. @garrigue, I will create a separate merge request if you prefer.

[edited for rephrasing the definition of the option]

@garrigue
Contributor Author

@bobot:
I'm not sure I understand your problem with links. My reason to use them is precisely because it's trivial to generate the links in the Makefile, and to remove them after generating the dependencies.
I'm not really opposed to adding a mechanism for file name mapping, such as reading a file containing pairs of names, but it requires rewriting a large part of ocamldep.ml, and it will always look clunky.

For the -map-directories option, I don't think it should go into ocamldep. In general, there may be ways to make things simpler for conventional cases, but I don't think this should go into ocamldep, as this is not related to the language itself. It would be better to write a wrapper that calls ocamldep with the right options.

@garrigue
Contributor Author

Some further thought about -o used to rename the output. This was introduced in 4.02 to ease the transition, and it appears that Jane Street could use it successfully. However, I don't think this is a robust approach: requiring all tools to support it would be a pain (ocamlbrowser, ocamldoc, etc...)
An ocamlbuild-like approach, of copying (or linking) all the relevant files to a compilation directory, giving them the right names, looks much more sound to me, without restricting expressiveness.

@gasche
Member

gasche commented Nov 19, 2015

We have around one month to be in time for the feature freeze of 4.03.

I would support a first patch series that does not handle the -o part (that is: ask code authors to use the long module names, rather than the short module names, as the filenames of the .ml files; they can use the short module names in the programs that open (in source or on the command-line) the name-mapping module(s)). Whatever good idea we find to support -o can be built on top of that, and I think it will be easier to build strong consensus on the simpler proposal first.

(This is not to say that @bobot's point that -o allows transitioning from -pack to -no-alias-deps is invalid. It is a very good point and it would be nice to find a solution for this. This may or may not be done in time for 4.03.)

My understanding is that Jacques' current proposal is such a proposal. Besides building better ocamlbuild support on top of it, I would like to better understand the internal details of the implementation (I think Jacques' current code is a bit too terse and more comments would help; his comments in the discussion above are already very helpful). If I find time to do a deeper code review I will try to contribute such clarification comments or maybe propose surface changes. This would also be helpful towards an output mode exporting the richer representation for other build system support.

Concerning the -paths option, I don't understand enough of how ocamlbuild works to see how this combines.

If the -modules output is consistent with the refined information you get thanks to the presence of the -map argument, then it may be the case that ocamlbuild already works better thanks to your changes. We should test it first.

@garrigue
Contributor Author

OK, I hope we can converge.
Is everybody happy now with the part currently implemented? (i.e. only -map for keeping a file in the environment, and -no-alias-deps to add fewer dependencies)
The names may not be perfect: -map could be named -read-aliases, and -no-alias-deps here is about compile time dependencies, while for ocamlc it is mostly about removing link time dependencies (but there is still an intersection with what the -no-alias-deps -w -49 of ocamlc does, i.e. only depend on a file if we need to look inside it).

Concerning output-renaming, I've thought about it a bit more, and my conclusion is that if we add something, this should be something easily handled by all source handling tools (ocamldoc, ocamlbrowser, ocamlbuild, ...), and maybe even the compilers.
A possibility, which I think is close to what @bobot had in mind, would be to have a special ocaml_srcmap file per directory. It would contain a mapping between source and object file names. For instance, for the no-alias-deps2 example:

lib <- lib     # could be omitted
LibA <- A
LibB <- B
LibC <- C

There would be no extra option to read it: by default ocamldep (and other tools which support it) would look for this file for each directory in the path, and read it.
It is important that it is a per-directory file, for two reasons. One is that you may have source files with the same name in different directories, so the resolution must be local to the directory. The other one is that we don't want to force all tools to understand the directory hierarchy.
Having a standard implementation to parse it and do lookups in it, available as a compiled module with the compiler, would allow other tools to use it easily. It would also allow extending its syntax if deemed necessary (doing something more namespace-like?).
Would such a solution fit the bill?
This should go into another pull-request.
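
For illustration only, here is a minimal sketch of what a parser and lookup for such a file could look like. Nothing like this exists in the compiler distribution: the function names (`parse_srcmap`, `source_of`) are hypothetical, and the reading of `LibA <- A` as "object name LibA comes from source name A", with `#` starting a comment, is an assumption based on the four example lines above.

```ocaml
(* Hypothetical parser for the ocaml_srcmap format sketched above.
   Assumed syntax: one "Object <- Source" pair per line, with "#"
   starting a comment. Not part of any existing tool. *)

(* Drop everything from the first '#' onwards. *)
let strip_comment line =
  match String.index_opt line '#' with
  | Some i -> String.sub line 0 i
  | None -> line

(* Parse one line into Some (object_name, source_name), or None for
   blank, comment-only or malformed lines. *)
let parse_line line =
  let line = String.map (fun c -> if c = '\t' then ' ' else c) line in
  let words =
    String.split_on_char ' ' (String.trim (strip_comment line))
    |> List.filter (fun w -> w <> "")
  in
  match words with
  | [obj; "<-"; src] -> Some (obj, src)
  | _ -> None

(* Parse the contents of a whole file, given as a string. *)
let parse_srcmap text =
  String.split_on_char '\n' text |> List.filter_map parse_line

(* Source name for an object name. Names without an entry map to
   themselves, following the "could be omitted" remark. *)
let source_of srcmap obj =
  match List.assoc_opt obj srcmap with
  | Some src -> src
  | None -> obj

(* The no-alias-deps2 example from the comment above. *)
let example =
  "lib <- lib     # could be omitted\nLibA <- A\nLibB <- B\nLibC <- C\n"

let () =
  List.iter
    (fun (obj, src) -> Printf.printf "%s <- %s\n" obj src)
    (parse_srcmap example)
```

A per-directory lookup would then amount to calling `parse_srcmap` on the file found in each directory of the path and consulting `source_of` for names resolved in that directory.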

@garrigue
Contributor Author

Did you read my argument for ocaml_srcmap ?
The point is that this information should be easily accessible from other tools.
Of course the ocaml_srcmap file can be generated from something else. But if you don't leave it around, tools like ocamlbrowser or ocamldoc cannot work properly.
Also, passing something manually means that in order to do anything with your sources, you already need to have some intimate knowledge of where the map file is, and what it applies to.

Honestly, passing an extra argument might be an option for ocamldep (and it would still add complexity), but it is completely unreasonable for other tools.

@garrigue
Contributor Author

I would of course gladly accept the suggestion that it is urgent to do nothing.

@gasche
Member

gasche commented Nov 28, 2015

Indeed, I think that merging what you already have, and then trying to support this in ocamlbuild is more urgent than giving short filenames to compilation unit sourcefiles.

Honestly, passing an extra argument might be an option for ocamldep (and it would still add complexity), but it is completely unreasonable for other tools.

If the alternative is ugly implicit state, I would rather have nothing at all.

@garrigue
Contributor Author

Other people may have other opinions. In particular people working on large codebases. That's why I'm asking the question here.

For ocamlbuild, I have never used it, and cannot understand a configuration file, so that's going to be hard...

@bobot
Contributor

bobot commented Nov 29, 2015

I think Frama-C and all its external plugins qualify as a large code base. We are not going to migrate from pack (which has its own problems) to module aliases if we have to manually prefix all the files with the name of the plugin. So we need renaming. However, even if it is very much needed, for us it is not urgent enough to do something that we will regret later.

About ocaml_srcmap, I don't understand the argument that ocamldoc should be treated differently from ocamldep:

Honestly, passing an extra argument might be an option for ocamldep (and it would still add complexity), but it is completely unreasonable for other tools.

We already need to give a lot of options to ocamldoc (at least all the -I), so I think uniformity and control are preferable. And why add another file format, if all the information can be found in a common -map option?

@lpw25
Contributor

lpw25 commented Nov 29, 2015

At Jane Street we already have our own approach to these issues, so the renaming features are not urgent for us. I think doing nothing for now is probably the best option.

@lpw25
Contributor

lpw25 commented Nov 29, 2015

@garrigue I think an earlier question of mine has been skipped over in this discussion, and I think it should be resolved before merging:

The thing that is still not clear to me is whether the behaviour of -no-alias-deps precisely matches up with what is done by the -map option.

The aim of the -no-alias-deps option (as I understand it) is to remove some dependencies from the file on the basis that -- as long as any other file which depends on this file uses the -map option with it -- no genuine dependencies will be dropped.

For this to be safe in general requires that the dependencies removed by -no-alias-deps are only those from aliases which will be correctly interpreted by -map. I haven't been able to work out, from the discussion so far, whether or not this is the case.

@garrigue
Contributor Author

@bobot This may be more important for ocamlbrowser than ocamldoc, because for ocamlbrowser you just give a list of directories. Merlin could also be an example. In general, if object files use names different from their source files, in a sane ecosystem there should be a stable way to map between them, without having to point to an explicit file. By the way, my use of a different format for the file was just pointing out that its semantics is different from that of an ocaml file, but it could clearly use the same syntax. The difference in semantics is that an ocaml file can also express more complex namespace constructions, such as nesting, so there is no reason the source map and the module map should be identical.

@lpw25 I thought the renaming from -no-alias-deps to -as-map had solved the problem: -as-map requires -no-alias-deps for the compiler, but is not strictly identical: (1) it may forget some dependencies because it doesn't take coercions from the mli into account, and (2) it only cares about compile time dependencies, while -no-alias-deps is also about avoiding link time dependencies (one can always break a compile time value dependency with an opaque interface). I think that (1) is ok, because we are talking only about map files here, for which this restriction is reasonable. Since ocamldep is doing an over-approximation for coercions, it is hard to have weak enough dependencies without this leeway.

@bobot
Contributor

bobot commented Nov 30, 2015

I think it is not a problem for merlin, since it uses the compiler's module resolution machinery and reads only cmi and cmt*. I think the goto definition uses the information in the cmt*, which will rightly point to the non-prefixed file. I think opam-doc also reads only compiled files. Perhaps a confirmation from the authors would help.

@gasche
Member

gasche commented Nov 30, 2015

This may be more important for ocamlbrowser than ocamldoc, because for ocamlbrowser you just give a list of directories.

Maybe we could store both a long/internal name and a short/external name in .cmi and .cmt files. OCamlbrowser could then use the short names. In this direction, I would like to study the possibility to add arbitrary "internal suffixes" to module names, to avoid linking-time name conflicts, but not in the 4.03 timeframe.

OCaml users are demanding more complex compilation environments (by compilation environments I mean the mapping from module paths to compilation units). If they want to use these richer features, they need to give more information to their tools. We should make it easy to give more information, rather than trying to hack global conventions to pretend that this demanded sophistication does not exist -- or if we decide to do that, we could at least choose a sensible convention such as following the filesystem hierarchy, instead of a hastily-baked ad-hoc ocaml_srcmap thing.

@lpw25
Contributor

lpw25 commented Nov 30, 2015

Perhaps a confirmation of the authors would help.

I can confirm that codoc (formerly opam-doc) only reads .cmi and .cmt(i) files.

@lpw25
Contributor

lpw25 commented Nov 30, 2015

I thought the renaming from -no-alias-deps to -as-map had solved the problem: -as-map requires -no-alias-deps for the compiler, but is not strictly identical

Sorry, my comment was not clear. It was referring to the ocamldep option formerly known as -no-alias-deps (i.e. -as-map) rather than the ocamlopt option -no-alias-deps.

The concern is about whether dependencies removed by -as-map will always be re-added by -map.

I'm assuming that it is possible for a module Foo to expose some module alias without -map foo.ml being able to determine that this is the case because ocamldep is conservative and cannot access the type environment. If this happens, and -as-map has removed the dependency caused by this alias, then ocamldep will have dropped a genuine dependency.

@garrigue
Contributor Author

The concern is about whether dependencies removed by -as-map will always be re-added by -map.

They will.
Let me explain a bit how the analysis works.
What the Depend module does is build a map from modules to their delayed dependencies, i.e. dependencies that are triggered when one accesses a module alias. Since this map is built incrementally, by reading the -map arguments, and then by parsing program files, the information is propagated through aliasing: an alias of an alias will just copy the original dependencies (I also add a dependency on map files containing the alias for completeness). As long as everything is transparent, you can go on propagating for ever, but when there is a coercion, or when a module is passed as argument to a functor, all the delayed dependencies are applied as dependencies of the current file. (I do not try to infer whether some dependencies could be left delayed or dropped.)
Now the main difference between using or not using -as-map is what to do at the end of the file. With -as-map, it is assumed that there is no coercion, so these dependencies are left delayed, and when you use -map on the same file, the same dependencies will be propagated to all the files which access it. For a normal implementation file, it is seen as coerced, so that all the dependencies are immediately actualized. So this is rather conservative in the safe way: you get more dependencies than strictly needed.

As for aliases appearing in an unexpected way, without being detected, the only way I would see it happening is through module types, as they are not tracked explicitly. I.e., the idiom of, in an mli map, first writing a module type, and then including it, is not supported at this point. But it would have to be contrived (with module types coming from another file), because aliases in a module type declaration are added as immediate dependencies.

For simple maps, including nesting and inclusion, I think that the current approach is more than sufficient.
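
To make the description above concrete, here is a toy model of the delayed-dependency idea. It is a sketch written for this discussion, not the code of the Depend module: the `item` type, `delayed` and `analyze` are invented names, coercions and functor applications are folded into a single `Access` case, and the extra dependency on the map file itself is left out.

```ocaml
(* Toy model of delayed dependencies. All names are hypothetical;
   this is not how Depend is actually written. *)
module SS = Set.Make (String)
module SM = Map.Make (String)

type item =
  | Alias of string * string  (* module X = Y *)
  | Access of string          (* X.f, or X coerced / passed to a functor *)

(* Delayed dependencies of a name: those recorded in the environment
   if the name is a known alias, otherwise the name itself, taken to
   be a compilation unit. *)
let delayed env name =
  match SM.find_opt name env with
  | Some deps -> deps
  | None -> SS.singleton name

(* Analyze a file given as a list of items. Returns the final
   environment and the immediate dependencies.
   With ~as_map:true the aliases defined by the file stay delayed;
   otherwise the file is treated as coerced and they are all forced. *)
let analyze ?(as_map = false) ?(env = SM.empty) items =
  let step (env, deps, bound) = function
    | Alias (x, y) -> (SM.add x (delayed env y) env, deps, x :: bound)
    | Access x -> (env, SS.union deps (delayed env x), bound)
  in
  let env, deps, bound = List.fold_left step (env, SS.empty, []) items in
  if as_map then (env, deps)
  else
    let force acc x = SS.union acc (delayed env x) in
    (env, List.fold_left force deps bound)

(* A map file in the style of lib.ml, and a client that uses it. *)
let lib = [Alias ("A", "LibA"); Alias ("B", "LibB"); Alias ("C", "LibC")]

let () =
  (* Roughly "-as-map lib.ml": nothing is forced. *)
  let map_env, map_deps = analyze ~as_map:true lib in
  (* Roughly "-map lib.ml -open Lib": the client starts from map_env.
     An alias of an alias copies the original delayed dependencies. *)
  let _, client_deps =
    analyze ~env:map_env [Alias ("X", "B"); Access "X"]
  in
  Printf.printf "map file: %d deps, client: %s\n"
    (SS.cardinal map_deps)
    (String.concat " " (SS.elements client_deps))
```

In this model, analyzing `lib` without `~as_map:true` forces all three aliases, which corresponds to the conservative behaviour described for a normal implementation file.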

@garrigue
Contributor Author

@gasche I completely agree with you that there is no hurry to work on file name mapping at this point. But I completely disagree on all the rest. We can restart this discussion later.

@lpw25
Contributor

lpw25 commented Nov 30, 2015

@garrigue Thanks, that answers my question perfectly. I support merging the -as-map and -map parts.

@gasche
Member

gasche commented Nov 30, 2015

(So do I.)

garrigue added a commit that referenced this pull request Nov 30, 2015
Add module alias support to ocamldep, and update documentation.
@garrigue garrigue merged commit f687907 into ocaml:trunk Nov 30, 2015
@mshinwell
Contributor

I think this is broken. Maybe the CI should test ocamldep?
I confirmed that this works if I revert 381328e

$ ocamldep.opt -modules -impl r.ml
File "r.ml", line 1, characters 0-3:
Error: Syntax error
r.ml:
$ cat r.ml
let foo () = ()

@garrigue
Contributor Author

garrigue commented Dec 1, 2015

It does. And I just checked now that doing make depend in the main ocaml directory works (it calls the new tools/ocamldep). So this is rather strange.

@garrigue
Contributor Author

garrigue commented Dec 1, 2015

Sorry, there is indeed a problem with ocamldep.opt. Very strange, as this is just the native code version of the same program.

@garrigue
Contributor Author

garrigue commented Dec 1, 2015

The problem doesn't exist in bytecode, and goes away if you don't use -impl. Looks like some problem elsewhere in the runtime system. I checked that if you build the bytecode version with the new ocamlc, you get the same problem. Could you try to see if a bootstrap solves the problem, or creates more problems?

@garrigue
Contributor Author

garrigue commented Dec 1, 2015

The previous comment is not very clear. The discrepancy comes from the fact that ocamldep is built using boot/ocamlc (old compiler) while the native code version uses ../ocamlopt (new compiler).
One gets the bug in bytecode if you use the new ocamlc, but I don't think that changes in ocamldep are the real cause, maybe some changes in the way arguments are processed by the Arg module?

@mshinwell
Contributor

I will investigate further. I think the compiler I used had been bootstrapped.

@mshinwell
Contributor

jdimino spotted this. It's an error in the ocamldep patch; it was parsing the .ml file as if it were an .mli. See commit f3ba667.

@garrigue
Contributor Author

garrigue commented Dec 1, 2015

Sorry. Thank you. (Looks like I was in a directory where ocamldep had not been recompiled... gah)
