Speed up linking by shrinking symbol and relocation tables#1222
Speed up linking by shrinking symbol and relocation tables#1222mshinwell merged 10 commits intooxcaml:mainfrom
Conversation
|
Is there a reason not to use the following type definition ? |
Not particularly? I don't have a very strong opinion here: the record is slightly closer to the truth (the symbol name is independent of its global status) but the variant has more lightweight syntax. |
|
I'm reviewing this as well as working on a version with flambda2 that would try and make local every symbol non-reachable from the cmx. |
Gbury
left a comment
There was a problem hiding this comment.
This is good from my point of review, but note that I did not review the cfg part, and that I skimmed over the the to_cmm part, since I rewrote most of it in #1246 which I expect to be merged very soon after this one.
Ideally, in the future, I'd very much like to do some more cleanup around symbols: this PR has small changes to make the local vs global symbols work, but this also means that there remains a lot of code handling symbols as strings. This makes it very easy to introduce bugs wrt locality (I have had to debug a few while writing #1246 ), so I think that after this one and #1246 , we should probably try and write another PR to use symbols instead of strings everywhere it makes sense (I'll probably do it). We could then consider all this work for upstreaming.
|
I looked at @stedolan 's fixes for @gretay-js 's comments and they look fine. Merging. |
Linking currently takes a long time because many symbol references (in particular, to statically allocated data) are not resolved until link time, despite many of these references being to data defined in the same file.
So far, this patch removes about half the symbols in compiler-libs, mostly unexported data. (The benefit applies only to Closure builds at the moment).
The major change is to use the following type for symbols in Cmm onwards, instead of
string:At assembly time,
Localsymbols are generated as assembler labels rather than ELF symbols, so references can be resolved eagerly.Globalsymbols are generated as both an ELF symbol and an assembler label, so thatLocalreferences within the same file can use the label form.There is quite a lot of low-hanging fruit here. The last commit here removes another ~8% of symbol references by wrapping
caml_call_gconce per file. We should also be able to turn references to functions in the same file into assembler label references, and to improve the encoding of debug info and frame tables to generate fewer relocations.