Conversation
|
It might be a good idea to have a log message in case some users get a slowdown. |
The cached function is indeed no longer used to implement |
|
Empirical observation shows that there are some cache hits. I'm not opposed to the change, but to gain confidence, it could be interesting to try building an ad hoc counterexample with many cache hits, to see a "worst case" slowdown if we remove the cache. Depending on the outcome, one could decide to use a better implementation for the cache. Btw, do you believe the slowdown in your case with the cache comes from the linear scan ( |
Indeed, roughly 1% of the calls are cache hits.
The linear scan is of course not great, but it doesn't seem to be a problem often. I applied this patch to the compiler: diff --git a/typing/env.ml b/typing/env.ml
index f49a84780..d7835c3b2 100644
--- a/typing/env.ml
+++ b/typing/env.ml
@@ -1602,13 +1602,18 @@ let prefix_idents root sub sg =
Hashtbl.add prefixed_sg root sgs;
sgs
in
+ let len = List.length !sgs in
try
- List.assq sg !sgs
+ let r = List.assq sg !sgs in
+ Printf.eprintf "CACHE HIT %d\n%!" len;
+ r
with Not_found ->
+ Printf.eprintf "CACHE MISS %d\n%!" len;
let r = prefix_idents root 0 sub sg in
sgs := (sg, r) :: !sgs;
r
else
+ let () = Printf.eprintf "CACHE MISS (not identity)\n%!" in
prefix_idents root 0 sub sgWhich when building the compiler itself ( And when building the same part of janestreet's codebase as before gives me: |
I don't know... The reason why the cache was added doesn't exist anymore. We don't have any other example of code that would benefit from it in practice. Building some degenerate piece of code which will greatly benefit from the cache will only show that that piece of code greatly benefits from the cache. What we currently have in this PR is evidence that the cache is
Both of which have the advantage of existing, and being somewhat representative of "typical" ocaml code. |
Well, if the slowdown of removing the cache is not dramatic even for a specially crafted example, we can blindly remove the cache. If not, this can give an intuition on the affected code style; and it is worth waiting a bit for people to benchmark compilation time on their own code base. We'll do that on ours and share the results here. |
|
@trefis: can you remove other occurrences of It seems that the cached function is O(n log n) in the number of items in the signature. The only use of Did I miss something? |
Indeed, I had forgotten. I pushed a commit doing just that. |
@alainfrisch: have you had an opportunity to run that benchmark yet? :) (sorry to be pushy, I agree that this PR is not the most important, but that's also why I'm afraid it'd be forgotten if not pinged) |
We did one benchmark that showed a (small) slowdown when the cache was present (sorry I can't find the exact numbers at the moment), similar to what is described in the PR description. |
|
I think this benchmark is enough to convince us that at least for our code base, the impact of removing the cache is at worst neutral and probably slightly positive, so for us, this is good to go. |
Removes a cache around
Env.prefix_identsthat has been made obsolete by #834 .Cf. MPR#5877 for the history.
For the record, building a non-negligible subset of janestreet's codebase with 4.07.1 takes 16m55s (±5s), building the same subset with 4.07.1 with this cache removed takes 15m05s (±5s).
I don't know if this deserves a changelog entry.