Skip to content

[RFC] .cmt support in caml-types.el#569

Closed
ghost wants to merge 9 commits intotrunkfrom
unknown repository
Closed

[RFC] .cmt support in caml-types.el#569
ghost wants to merge 9 commits intotrunkfrom
unknown repository

Conversation

@ghost
Copy link
Copy Markdown

@ghost ghost commented Apr 29, 2016

(See MPR #4369 and MPR#5812 for the discussion on Mantis)

The purpose of this PR is to gather feedback on an approach to enable caml-types.el to make use of .cmt files.

A new tool tools/cmt_lookup is introduced to look up type and identifier information in cmt files (since these files contain marshalled structures they cannot easily be read by emacs). caml-types.el is adapted to use this tool for the show-type (look up a type using C-c C-t) and show-ident (look up scope information using C-c C-l) functionalities.

Advantages of using .cmt files

  • No need to generate .annot files which take quite a lot of time and space to generate, specially if using many large structural types (like objects).
  • The possibility of more precise and richer information vs the information contained in .annot files. While not yet implemented in this PR, it would be easy to add features such as jump-to-definition, etc. thanks to the additional information found in the .cmt files.

Disadvantage of the current approach (maybe)

  • .annot files can be cached on the emacs side since they are textual; by contrast cmt files need to be read and parsed anew each time. However this seems to work fast enough in practice, especially if using the native-code version of the tool.

Other possible approaches/improvements

  • Make the tool a long-running process to cache the cmt file inside it.
  • Produce the .annot file directly from the cmt file (like in the tools/read_cmt tool) and read that from emacs. This negates the advantages of using cmt files in the first place (better type information, etc.)

What about ocp-index ?

  • While ocp-index uses .cmt files its existing code base is not geared towards lookup of information based on location, but rather towards more global features like completion (see also Show type seems very buggy OCamlPro/ocp-index#18). Moreover, I think that caml-types.el is the common denominator in terms of editor support and in my view there is a need for it to work with .cmt files.

@alainfrisch
Copy link
Copy Markdown
Contributor

No need to generate .annot files which take quite a lot of time and space to generate, specially if using many large structural types (like objects).

As an illustration, we have one reasonnably-sized module which results in a 10Mb .annot file (2.5 Mb .cmt; 700kB .obj), and 30% of compilation time for this module is spent generating the .annot file. Moreover, this .annot file seems to blow up the regexp parser in .emacs, so it cannot even be used for type feedback.

One should mention that one drawback on relying on an external tool is that emacs must be able to locate it and pick the tool from the exact same version as the compiler used to produce the .cmt file. This means that except if some other machinery is put in place (e.g. specifying the location of this tool in some special file e.g. at the project root), it won't be possible to use a single instance of emacs to edit files compiled with two different versions of OCaml. Probably not a big deal for most people.

@alainfrisch
Copy link
Copy Markdown
Contributor

Produce the .annot file directly from the cmt file (like in the tools/read_cmt tool)

Or even tools/cmt2annot 😄

@trefis
Copy link
Copy Markdown
Contributor

trefis commented Apr 29, 2016

While I'm completely in favor of the extinction of .annot files I'm a bit surprised by this PR.
While the current trend seems to get tools (camlp4, ocamlbuild, etc) outside the official distribution, here you propose to add a new one with a justification that I find rather weak.

Indeed your argument for having this new tool is: "yes ocp-index already exists, but it doesn't do what I want". And for having it inside the distribution you say

Moreover, I think that caml-types.el is the common denominator in terms of editor support

Regarding your first argument, I wonder why you don't instead contribute the feature you want to ocp-index? (also the link you gave us is 3 years old, so maybe the issue is fixed now?).

The second argument is typical of emacs users. What about vim, sublime-text, atom, eclipse users?
A lot of efforts has been put into this kind of tooling in the past years (camlspotter, typerex, ocp-index, merlin) and they didn't seem to suffer from living outside the distribution (on the contrary, people would argue that it makes it easier to improve these tools).

@ghost
Copy link
Copy Markdown
Author

ghost commented Apr 29, 2016

Hi @trefis; please note in principle I am not proposing this for inclusion in the official distribution. Rather, like I wrote above,

The purpose of this PR is to gather feedback on an approach to enable caml-types.el to make use of .cmt files.

In fact it was suggested by @gasche in MPR#4369 to open a PR in order to discuss and gather such feedback.

Regarding the issues you raise:

Regarding your first argument, I wonder why you don't instead contribute the feature you want to ocp-index? (also the link you gave us is 3 years old, so maybe the issue is fixed now?).

As far as I know there has been no progress in the last 3 years on this front. ocp-index addresses different set of features and it has its own problems due to this (it is quite slow on Windows when dealing with a large source tree). The objective here is to replicate the functionality (plus epsilon) of caml-types.el using .cmt files, nothing more.

The second argument is typical of emacs users. What about vim, sublime-text, atom, eclipse users?
A lot of efforts has been put into this kind of tooling in the past years (camlspotter, typerex, ocp-index, merlin) and they didn't seem to suffer from living outside the distribution (on the contrary, people would argue that it makes it easier to improve these tools).

  • As mentioned above, I am not proposing this for inclusion.
  • Note that it would be quite easy to add the same support to all the other editors; all that is needed is someone motivated to do it.
  • typerex: as far as I understand this is another name for ocp-index, which was discussed above.
  • merlin is great but quite heavyweight when all you want is to print types. Also I have found it hard to build it on Windows.
  • camlspotter: I have never used it, maybe someone can explain what it does/how does it work ?

Thanks!

@alainfrisch
Copy link
Copy Markdown
Contributor

The fact is that caml-types.el is part of the core distribution and that many people actually use it. It relies on .annot file but we would like to discourage people from using -annot. My understanding is that the tool proposed here is a way to modernize the "official" emacs mode for ocaml. Perhaps this mode itself should be moved itself out of the main distribution and distributed as a standalone package (together with the tool proposed here), but as long as it is not the case, I wouldn't be shocked if the tool were included.

@dbuenzli
Copy link
Copy Markdown
Contributor

The fact is that caml-types.el is part of the core distribution and that many people actually use it. It relies on .annot file but we would like to discourage people from using -annot.

I'm using it. I stopped generating .annot files when @let-def cornered me and forced me to install merlin.

However there's one thing that was nice with .annot files is that IIRC you'd get best-effort information with show-type when your program refused to typecheck which is the exact moment when you need the feature most to explore the type of values around the failure.

It seems to me that tools like merlin still need to type-check (or maybe have previously type-checked once ? I don't remember, it's magic). Does the switch to .cmt files here keep that ability or are .cmt files just not able to represent that kind of partial information ?

@ghost
Copy link
Copy Markdown
Author

ghost commented Apr 29, 2016

Yes, .cmt files keep information from partially type checked files and this tool uses it.

@Chris00
Copy link
Copy Markdown
Member

Chris00 commented May 1, 2016

Tuareg uses caml-types.el and it would be great to be able to use .annot and .cmt files out of the box (whether merlin is installed or not).

@Chris00
Copy link
Copy Markdown
Member

Chris00 commented May 1, 2016

One should mention that one drawback on relying on an external tool is that emacs must be able to locate it and pick the tool from the exact same version as the compiler used to produce the .cmt file.

@alainfrisch Tuareg already tries to detect the proper environment to set up for the compilation to be in sync with opam switch. Instead of simply failing, one should design the tool so that it reports the compiler that one should use (in this case, Tuareg can try to switch using it if is is installed with opam or if the tool has a name cmt_lookup_⟨compiler version⟩ and is in the path). I think that quite a few people use opam switch and it would be nice that they are not left wondering why some tool is not working...

@gasche
Copy link
Copy Markdown
Member

gasche commented May 1, 2016

@Chris00 an excellent remark; My own #463 is a step in this direction, but currently it only says "this magic number comes from an {older, more recent} OCaml version". For older versions it would be easy to keep a log of older magic numbers to predict the specific version to use. I'm not sure what to do for future versions (it's not really possible to guess which magic number future releases will have).

For the specific case of .cmt(i) files, as opposed to arbitrary compiled products, the easiest way may be to embed the OCaml version number in the cmt_infos, at a point that is stable for all future magic numbers. I don't think it is currently available, but enabling it starting in 4.04 would allow us to accurately predict even future versions.

@trefis
Copy link
Copy Markdown
Contributor

trefis commented May 1, 2016

2016-04-29 14:45 GMT-04:00 Nicolas Ojeda Bar notifications@github.com:

As mentioned above, I am not proposing this for inclusion.

That wasn't clear to me, apologies.

Note that it would be quite easy to add the same support to all the other editors; all that is needed is someone motivated to do it.

:)

2016-04-29 16:08 GMT-04:00 Nicolas Ojeda Bar notifications@github.com:

Yes, .cmt files keep information from partially type checked files and this tool uses it.

It will only contain information about the prefix of the file which
typechecked successfully. Let's not forget that the current compiler
stops at the first error.
I'd be surprised if that was the "best effort" that Daniel has in
mind. I think what he is referring to is the fact that if you leave
old cmts around, and your code hasn't changed much you get access to
outdated-but-good-enough type information.

@ghost
Copy link
Copy Markdown
Author

ghost commented May 1, 2016

As far as I understand, both .annot and .cmt files contain information about the part of the file that was successfully typechecked up to the first type error. In fact the information both both files is tracked together (see eg https://github.com/ocaml/ocaml/blob/trunk/typing/typecore.ml#L112).

I don't understand the remark about keeping old .annot files around since they are generated even when there is a type error so there is no easy way to do that generally.

beforedepend::

install::
cp cmt_lookup cmt_lookup.opt $(INSTALL_BINDIR)/
Copy link
Copy Markdown
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

You should protect $(INSTALL_BINDIR)/ against spaces — the build fails on Windows because of that.

Copy link
Copy Markdown
Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Thanks! It is fixed now.

@lefessan
Copy link
Copy Markdown
Contributor

lefessan commented May 1, 2016

If we include "cmt_lookup" in the distribution, and update caml-types.el to use it, people will soon start improving it (with PR), until it becomes either Merlin or ocp-index, or they will get frustrated with a half-baked tool. This is exactly the opposite direction of what we are currently doing, i.e. moving things out of the distribution. If we add this, then we should also think about adding ocp-indent to the distribution as an improvement for caml-mode.el broken indentation support.

Why not modify caml-types.el instead to check for the presence of external tools (merlin, ocp-index, cmt_lookup, etc.) and use them when they are available (and .cmt files also) ?

@ghost
Copy link
Copy Markdown
Author

ghost commented May 1, 2016

Hi @lefessan,

  • merlin is much more powerful than caml-types.el and if you are using it you don't need/want caml-types.el anyway;
  • ocp-index provides a slightly different set of features than caml-types.el (you get completion but not types for subexpressions)

So I am not sure what would be the point of interfacing these other tools with caml-types.el.

The point of this PR is to get caml-types.el (which is used by a lot of people without any extra "advanced" tools like merlin, etc.) to use .cmt files, nothing more.

Having said that I do agree that putting this inside the distribution goes against the direction in which things have been moving. But then the status of the emacs directory needs to be clarified: does it really make sense to keep these files in the distribution if they are not being updated to take advantage of features introduced 2 or 3 compiler versions ago ?

Thanks!

@gasche
Copy link
Copy Markdown
Member

gasche commented May 1, 2016

I think that we have a decision to make here: either we decide that editor support is the land of external tools and remove emacs/ altogether, donating it to the Tuareg project (I guess), or we decide to preserve the existing feature level of caml-types.el and investigate minimal ways to do this without .annot files for practical reason.

The main downside of the second choice is that it encourages people to pour effort into sub-par solutions instead of strengthening existing tools. For example, the work that @nojb did here could arguably have been an excellent pull-request to improve ocp-index's show-type mode.

The main downside of the first choice is that evidently some users are content with just caml-types.el and have not been able to directly replace it with either Merlin or ocp-index:

  • because of Windows support (cited by @nojb on Merlin; what about ocp-index?)
  • because of the more difficult interaction of these tools and opam switches (cited by @Chris00)
  • or because they are still inferior in support of some language features (querying the type of a method, cited by Boris in PR#4369)

I have tried the "it's the better long-term choice to discourage people to work on X" myself (#241), and I notice that it does not really work. I would personally be of the opinion of letting those who do the work decide. If @nojb took the time to develop a nice solution for caml-types.el, we could respect it and include it for now -- this does not preclude deciding that emacs/ should live outside the distribution at a later point in the future. If others want to convince him and other caml-types.el users than there are better alternatives, I guess the onus is on them to solve the perceived problems with these better alternatives, possibly reusing @nojb's work whenever relevant.

Edit: maybe the proposed solution has a too-high risk of regression, is too hard to maintain, is not the right technical choice or does not actually solve the problem (because of opam switches, for example). If so, then say it! That is a different kind of objection.

@dbuenzli
Copy link
Copy Markdown
Contributor

dbuenzli commented May 1, 2016

Le dimanche, 1 mai 2016 à 23:19, Nicolas Ojeda Bar a écrit :

But then the status of the emacs directory needs to be clarified: does it really make sense to keep these files in the distribution if they are not being updated to take advantage of features introduced 2 or 3 compiler versions ago ?

I think it should be forked into its own project in the OCaml organisation. It's again a component that could benefit of not being tied to OCaml's own release cycle. Note that its installation already occurs through a separate package in OPAM.

Daniel

@lefessan
Copy link
Copy Markdown
Contributor

lefessan commented May 2, 2016

@gasche

I think that we have a decision to make here: either we decide that editor support is the land of external tools and remove emacs/ altogether, donating it to the Tuareg project (I guess), or we decide to preserve the existing feature level of caml-types.el and investigate minimal ways to do this without .annot files for practical reason.

There is a third solution: keep the current mode as it is, use .annot files for small projects, and when projects grow bigger, people should just improve their environment to cope with the growth, i.e. start using ocp-index or merlin. With the help of OPAM, it would be easy to add an extended-caml-types.el package that would use them with the same bindings as caml-types.el.

The main downside of the second choice is that it encourages people to pour effort into sub-par solutions instead of strengthening existing tools. For example, the work that @nojb did here could arguably have been an excellent pull-request to improve ocp-index's show-type mode.

As far as I know, ocp-index already has a show-type mode, so maybe it would be more productive indeed to understand its limitations and fix them, and to bind it an emacs mode similar to caml-types.el.

because of Windows support (cited by @nojb on Merlin; what about ocp-index?)

As far as I know, both tools are available under Windows. Independently of this PR, it would be interesting to know which limitations were observed under Windows (there is no issue related to Windows on ocp-index BTR)

because of the more difficult interaction of these tools and opam switches (cited by @Chris00)

Then, we should include either ocp-index or Merlin in the distribution, that would solve these issues, since they are related to the fact that the tools are not always installed in all switches.

or because they are still inferior in support of some language features (querying the type of a method, cited by Boris in PR#4369)

If cmt_lookup can do it, I don't see why it can't be done in existing tools.

I have tried the "it's the better long-term choice to discourage people to work on X" myself (#241), and I notice that it does not really work. I would personally be of the opinion of letting those who do the work decide.

I don't understand your argument. You mean that, because I always use ocp-build to compile my ocaml programs, I should be allowed to add it to the OCaml distribution, because I have done the work and it makes it easier for me to use it ?

If @nojb took the time to develop a nice solution for caml-types.el, we could respect it and include it for now

Are you accusing me of disrespecting the work of @nojb ? So now, when somebody disagrees with merging a PR, it's a lack of respect for the work done ? Please, could we keep the debate technical ???

Edit: maybe the proposed solution has a too-high risk of regression, is too hard to maintain, is not the right technical choice or does not actually solve the problem (because of opam switches, for example). If so, then say it! That is a different kind of objection.

That's exactly what I said: if you include this tool in the distribution, it will either start growing with new features to become Merlin, and we will have to move it outside of the distribution, or it will be kept limited, and users are going to complain that it is missing all the nice features you could also do when accessing .cmt files. Either way, instead of having two projects to lookup types, we will have a third one, and it will be harder for newbies to make their choice.

@damiendoligez
Copy link
Copy Markdown
Member

  1. I'm definitely planning to push the emacs stuff out of the distribution and into a separate project.
  2. I have a pet project of executing OCaml code within emacs (currently with a byte-code interpreter written in emacs-lisp) that would provide another solution. It is in "almost working" state.

@yakobowski
Copy link
Copy Markdown
Contributor

The "IDE mode" front has been pretty bleak in the past years, with a split between the Caml mode and Tuareg. We now have extremely efficient replacement for on-the-fly type queries and indentation with Merlin and ocp-indent. On real-sized projects, the productivity boost obtained with Merlin -- in particular with go-to-definition -- is remarkable. I quite agree with @lefessan on the fact that awakening caml-types.el from its torpor would likely only lead to a half-baked re-implementation of Merlin.

@nojb, you mentioned on Mantis that Merlin has many dependencies and is hard to install on Windows. Do you have more details on which dependency is problematic? My gut feeling is that we should strive to make the installation of Merlin painless everywhere.

@let-def
Copy link
Copy Markdown
Contributor

let-def commented May 5, 2016

I have the script needed to build Merlin on windows and the final binary is relocatable.
This should fit regular windows users.

Hardest part is installation of dependencies (yojson).

@ghost
Copy link
Copy Markdown
Author

ghost commented May 5, 2016

Indeed! It is not so much that it is not possible to install merlin's dependencies on Windows but that it has many dependencies and there is no opam to help. I don't remember the details now but I remember trying to install yojson's dependencies (cppo, easy-format, biniou) inside Cygwin and not succeeding right away.

@bobzhang
Copy link
Copy Markdown
Member

bobzhang commented May 5, 2016

@let-def if it is pure ocaml, I find one file distribution is really cool, I have a story of packing whole dependencies into a single file (around 100K loc), and compile, it just works, the compilation time is around 20s (native build).
I even managed to bundle the whole ocaml compiler in a single file.
For fundamental tools like merlin, easy distribution is very helpful (we don't use opam in production, yet )

@ghost
Copy link
Copy Markdown
Author

ghost commented May 13, 2016

Closing this, as the discussion seems to have bottomed-out.

@ghost ghost closed this May 13, 2016
stedolan added a commit to stedolan/ocaml that referenced this pull request Mar 22, 2022
stedolan added a commit to stedolan/ocaml that referenced this pull request May 24, 2022
64235a3 flambda-backend: Change Float.nan from sNaN to qNaN (ocaml#466)
14a8e27 flambda-backend: Track GC work for all managed bigarray allocations (upstream 11022) (ocaml#569)
c3cda96 flambda-backend: Add two new methods to targetint for dwarf (ocaml#560)
e6f1fed flambda-backend: Handle arithmetic overflow in select_addr (ocaml#570)
dab7209 flambda-backend: Add Target_system to ocaml/utils (ocaml#542)
82d5044 flambda-backend: Enhance numbers.ml with more primitive types (ocaml#544)
216be99 flambda-backend: Fix flambda_o3 and flambda_oclassic attributes (ocaml#536)
4b56e07 flambda-backend: Test naked pointer root handling (ocaml#550)
40d69ce flambda-backend: Stop local function optimisation from moving code into function bodies; opaque_identity fixes for class compilation (ocaml#537)
f08ae58 flambda-backend: Implemented inlining history and use it inside inlining reports (ocaml#365)
ac496bf flambda-backend: Disable the local keyword in typing (ocaml#540)
7d46712 flambda-backend: Bugfix for Typedtree generation of arrow types (ocaml#539)
61a7b47 flambda-backend: Insert missing page table check in roots_nat.c (ocaml#541)
323bd36 flambda-backend: Compiler error when -disable-all-extensions and -extension are used (ocaml#534)
d8956b0 flambda-backend: Persistent environment and reproducibility (ocaml#533)
4a0c89f flambda-backend: Revert "Revert bswap PRs (480 and 482)" (ocaml#506)
7803705 flambda-backend: Cause a C warning when CAMLreturn is missing in C stubs. (ocaml#376)
6199db5 flambda-backend: Improve unboxing during cmm for Flambda (ocaml#295)
96b9e1b flambda-backend: Print diagnostics at runtime for Invalid (ocaml#530)
42ab88e flambda-backend: Disable bytecode compilers in ocamltest (ocaml#504)
58c72d5 flambda-backend: Backport ocaml#10595 from upstream/trunk (ocaml#471)
1010539 flambda-backend: Use C++ name mangling convention (ocaml#483)
81881bb flambda-backend: Local allocation test no longer relies on lifting (ocaml#525)
f5c4719 flambda-backend: Fix an assertion in Closure that breaks probes (ocaml#505)
c2cf2b2 flambda-backend: Add some missing command line arguments to ocamlnat (ocaml#499)

git-subtree-dir: ocaml
git-subtree-split: 64235a3
EmileTrotignon pushed a commit to EmileTrotignon/ocaml that referenced this pull request Jan 12, 2024
Co-authored-by: tmattio <tmattio@users.noreply.github.com>
This pull request was closed.
Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment

Labels

None yet

Projects

None yet

Development

Successfully merging this pull request may close these issues.