Skip to content

[occurrences] Store modules' shapes in env sumaries#11983

Closed
voodoos wants to merge 1 commit intoocaml:trunkfrom
voodoos:store-shapes-in-env-summaries
Closed

[occurrences] Store modules' shapes in env sumaries#11983
voodoos wants to merge 1 commit intoocaml:trunkfrom
voodoos:store-shapes-in-env-summaries

Conversation

@voodoos
Copy link
Copy Markdown
Contributor

@voodoos voodoos commented Jan 31, 2023

When the shapes were introduced into the compiler it was primarily with the goal of supporting jump-to-definition. For this purpose, visible shapes of compilation units where added to the cmt files and Merlin use these when reducing a value's shape to find its definition.

However, when indexing the code-base to provide project-wide occurrences having only the shape of the top-level of the compilation unit is not sufficient: one have to iterate over values in the cmt's Typedtree and reduce their shapes and that reduction require access to intermediate modules' shapes.

However, naively adding the shapes to the summaries impact the cmt filesize a lot. On Irmin, with this PR, it goes from 190M to 522M. This is the reason why I opened this PR as a draft, so that we can discuss options.

  1. I was expecting more sharing to take place. Maybe there is a way to optimize for it ? (ping @trefis) Or maybe we simply add a lot of otherwise discarded information.
  2. Maybe we could cut some branches down an only store the shapes of modules which are eventually exposed by the compilation unit ?
  3. We discuss with @lpw25 the possibility of having the compiler store the CU's index itself: a table from partially reduced shapes to locations. Doing that should remove the need from having the shapes of intermediate modules.

For now I have little hopes for 1, and will have a try at 2 since it sounds less invasive than 3. Any comments / ideas are welcome :-)

@voodoos
Copy link
Copy Markdown
Contributor Author

voodoos commented Feb 1, 2023

A fourth option would be to add a new flag -shapes-in-summaries and only add shapes to cmt's if its present.
This is not a very satisfactory solution but it could be an easy first step while waiting for a more involved solution

@lpw25
Copy link
Copy Markdown
Contributor

lpw25 commented Feb 1, 2023

I think that it would be worth prototyping 3 if you can. I suspect it is the best approach, but we really need to see some performance and memory size numbers to be sure.

@voodoos
Copy link
Copy Markdown
Contributor Author

voodoos commented Mar 28, 2023

Closing in favor of #12142

Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment

Labels

None yet

Projects

None yet

Development

Successfully merging this pull request may close these issues.

2 participants