Conversation
|
Also @dbuenzli |
|
xref #7080 Any reason why you don't simply specify the files as positional arguments like other compilation objects ? (rather than overload |
If we use positional arguments it is a bit less clear what to do in some situations. For example, suppose the compiler is invoked to link something. Then So I as a first step I reused the |
I may be forgetting something but I think it is unambiguous:
|
utils/load_path.ml
Outdated
| else if Sys.is_directory dir then | ||
| Dir, Sys.readdir dir | ||
| else | ||
| File, [|Filename.basename dir|] |
There was a problem hiding this comment.
I'm not very familiar with the compiler code base, but where do you make sure that this file actually exists?
There was a problem hiding this comment.
Looks better, although I wish there was a stricter mode. In other words, the compiler would error if provided files that don't exist. This would be best for directories as well obviously, but there are some compatibility concerns. From the point of view of the build system, it would be more helpful if compiler is lax about this stuff.
There was a problem hiding this comment.
We expect the number of .cmi/.cmx files passed on the command-line to be potentially very large (transitive dependencies), especially if the feature is used to implement libraries in the build system. So I'm slightly concerned by the extra cost induced by those Sys.is_directory and Sys.file_exists. Remember that we observed significant gains when we switch from file lookups (admittedly in may directories) to reading whole directories at once.
Cannot we not add any such calls? Just remember which .cmi/.cmx file have been added (based on their extension, or specific CLI options...) to make them available if the compiler needs them.
Yes, I think this makes sense. |
trefis
left a comment
There was a problem hiding this comment.
In order not to pollute the patch, I left for later renaming dir for something more appropriate reflecting that now both files and directories can be used in the load path.
I think that's a mistake.
The code just looks weird as a result of this PR.
utils/load_path.ml
Outdated
| module Dir = struct | ||
| type kind = | ||
| | File | ||
| | Dir | ||
|
|
||
| type t = { | ||
| kind : kind; | ||
| path : string; | ||
| files : string list; | ||
| } |
There was a problem hiding this comment.
Please do change this in this PR.
There was a problem hiding this comment.
I renamed "dir" by "entry".
|
Also, could you add tests? (That |
|
I don't think |
|
@lpw25 could you please expand on what it means to be essentially consumed ? The digest of As a cli user is see the compiler as a function taking input files and producing new ones. The type for these files can be detected by their extension and their purpose unambiguous in each of the tool's mode of operation, as such I don't see why we can't simply specify them as positional arguments. |
|
And btw. I would actually rather be against overriding |
|
@dbuenzli What I'm trying to get at is that all OCaml's source files and intermediate files get used as positional argument exactly once -- when they are the main focus of the command and are being processed in some way to produce the output. Whereas .cmi files are used by many different compilation commands in order to look things up from the interfaces. They are being used to help compile other modules, but they are not themselves being compiled. |
|
I agree that it would be better to use a new parameter name rather than overloading |
|
@lpw25 I still don't see what kind of benefit you get from that perspective as far as cli usability is concerned. Your goal can't be achieved since compiled objects as positional arguments can already be used to add them to an archive or link them into an executable. So you already have two different positional meaning here for an object. It seems to me that the line you are trying to draw is quite a brittle one. For example why do you mention If people feel strongly about that I'm fine about another cli flag, I just find it totally redundant and unnecessary since the compilation mode and extensions allow us to understand what is happening without ambiguity. |
To me it seems both an obvious difference and a fundamental one. I'd find it really strange to pass cmi files positionally to a compilation command. Maybe another way to look at it: if I passed in additonal cmi files it would not change the output. That is not true of the other positional arguments.
They aren't allowed to be positional either for the exact same reason. |
You didn't mention them in that message
So if a file is tagged with a cli flag, then it may or it may not change the output. If it's positional then it has to change the output. Well I don't know where you get these conventions from but it doesn't change anything as far as the compiler and the user are concerned and it's not helpful in anyway. From an incremental build perspective what I want is simply the ability to specify which files I consider will be read by the compiler. After that what I'm really, really, really interested is knowing whether an input file actually influenced the output not, not whether it might have done so or not. |
|
I've played a bit with the compiler's command line, and it happens that (for Also, the error message for giving a The problem is different with (I'm fine with both the approaches of @dbuenzli and @lpw25. I just thought I'd point out that the current behaviour of the compiler might not be the best reference.) |
|
Ok, one last attempt to express the condition in a way that makes it clear: When you call a compiler the positional arguments should be the files that are getting compiled. When you call a linker the positional arguments should be the files that are getting linked. They don't need flags because the action to be performed is implied by the name of the tool itself. Anything else the compiler or linker might require needs to be flagged to indicate what role it has in the process. You are not compiling those |
|
Ok so let me clarify where exactly we differ.
This is were we differ. This is some kind of esthetical rule which has no fundamental cli usability justification.
The role of each file can be clearly identified by its extension so flagging them is redundant from a cli interaction perspective. |
|
I don't have a strong opinion on the cli debate, but just one argument in favor of positional arguments: they make it easier to use shell wildcards (e.g. |
That is only the case if the role is both unique and obvious. It is not clear that holds for cmi files. For example, when using |
My suggestion is of course only if it's clearly unambiguous in all of the cases. Otherwise a new option should be added but I think it should be short and unique (if we can assume the "side" role will be unambiguous according to the phase). I would have proposed |
I added a test, thanks. |
|
If I followed the discussion correctly, the idea of passing cmi/cmx files as positional arguments has been abandoned. So the remaining propositions are:
For which I'm aware of the following opinions:
I personally believe that this new option will primarily be used by build systems, not by end users, so the length of the flag doesn't matter so much, and we can afford to provide something verbose but easily readable / understandable. (Which does not mean that I believe Would other participants in this discussion (in particular, @nojb) care to voice an opinion so we can move forward with this PR? |
|
I also dislike short names (I spend more time reading logs of command-line invocations than writing them, so for me clearer names are better). I'm perfectly fine with |
|
I have no strong opinion on the naming for options. I'm rather against -I for the following reason: it forces at least one syscall per command-line option (to read .cmi files available in the directory), unless we decide that directory names cannot have a .cmx/.cmi extension. For performance reasons (esp. under Windows), I expect that a .cmi/.cmx file passed to the compiler should not produce any syscall if that file is not actually looked up. The build system might end up passing hundred or thousands of files on the command-line if it chooses to re-implement the notion of libraries itself. Even if it does some dependency analysis first, it will still need to pass the transitive closure of immediate dependencies (e.g. to resolve type aliases). |
That's of course not the point, the point is to have readable logs. Compactness is also part of readability. In particular having |
|
I think we should not duplicate the extension in the command line flag since it is redundant. What about |
|
I'd rather keep these uppercase letters for dirs. What about |
|
Remark: I believe that the two last models of @dbuenzli ( |
Indeed! Thanks for pointing it out. This means that the current implementation fits @dbuenzli's (second) proposed semantics. |
|
I realize that my attempt at thinking about the problem and giving an opinion may have held back work on this PR, which is counterproductive as I think it is a very nice feature to have. Let me summarize my understanding, and make it clear that I would welcome moving forward:
|
|
Thanks @gasche for the comments. I am coming back from the holidays and I expect to be too busy to work on this for the next week or two, but will take it up again after that. |
|
As a way of pinging this PR, let me give my opinion: the latest proposal by @dbuenzli and @nojb is the one I find the easiest to understand. And it doesn't have "weird lookup-semantics discontinuities", the distinction between basename queries and path queries is only an optimization, not part of the semantics, and probably only worth documenting as a side note. |
|
That's great, thanks for the feedback, let's go for that! |
|
@nojb would you be available to revive this PR? It seems that we have reached a weak consensus on the semantics, so the remaining bits should be the easy ones! (famous last words, etc.) |
Yes, I haven't forgotten about it... I was meaning to get back on it. I will try to do so soon. |
3433495 to
32b5c39
Compare
|
@nojb does it make sense to add a toplevel directive to include a single .cmi? The equivalent of |
|
I discussed this PR with @Octachron a few weeks ago. We're all very busy but we think that it is a nice improvement, and would like to progress towards upstreaming. |
Thanks for the ping, and sorry for the lag. This has thoroughly bitrotted, but I will rebase it when I get a chance, hopefully soon. |
|
(New gentle ping, even though there is absolutely no pressure.) |
|
I think this PR has gone beyond its shelf life; unfortunately I don't seem to be able to find the time to get back to it. Closing! |
|
I'll try to see if I can push this to completion myself. |
This PR proposes a way to add a single file to the load path (which up to now consisted of a list of directories). The reasons for doing so are multiple but they revolve around the need for greater control over dependencies.
For the user interface, I reuse the
-Iflag, but for files:-I foo, wherefoois a filename. Here,foocan be any file, but the situation to keep in mind is whenfoois a.cmiand/or.cmxfile.The following interaction gives a sense of what this patch does:
In order not to pollute the patch, I left for later renaming
dirfor something more appropriate reflecting that now both files and directories can be used in the load path.Opinions welcome!
@trefis @alainfrisch @diml