Conversation
|
Naive question: to perform generalization, we only need to traverse the type variables that occur in the result type. Here, if I understand correctly, we generalize all type variables at the level we are exiting, including those that were only used internally for type-checking and do not occur in the result. This changes the algorithmic complexity of inference, right? In the past I proposed that the For example, in trunk we have let cty, ty, force, cty', ty', force' =
with_local_level_iter ~post:generalize_structure begin fun () ->
let (cty, ty, force) =
Typetexp.transl_simple_type_delayed env sty
and (cty', ty', force') =
Typetexp.transl_simple_type_delayed env sty'
in
((cty, ty, force, cty', ty', force'),
[ ty; ty' ])
endwith your PR we have the nicer let cty, ty, force, cty', ty', force' =
with_local_level_generalize_structure begin fun () ->
let (cty, ty, force) =
Typetexp.transl_simple_type_delayed env sty
and (cty', ty', force') =
Typetexp.transl_simple_type_delayed env sty'
in
(cty, ty, force, cty', ty', force')
endbut a middle-road would be let cty, ty, force, cty', ty', force' =
with_local_level_iter begin fun gen ->
let (cty, ty, force) =
Typetexp.transl_simple_type_delayed env sty
and (cty', ty', force') =
Typetexp.transl_simple_type_delayed env sty'
in
(cty, gen.structure ty, force,
cty', gen.structure ty', force')
end |
|
Indeed, it looks like we do useless work when we generalize nodes that will just be garbage collected. |
ff85d78 to
a360fa1
Compare
|
Performance measurements on ocamldoc, using ocamlc.opt and ocamlopt.opt. Note that to test with ocamlop.opt one has to add the following line to |
|
The PR is now ready for review. |
edd8f0b to
ff52647
Compare
|
I have a couple of concerns about this PR. The first is around safety: generlizing type nodes is not the conservative option. When leaving a scope for an expansive expression, this PR means that a failure to call The second is around performance. The benchmarks here are pretty small and already show some cost. I'd prefer to see something a little bigger and more varied tested if possible. Have you tried profiling to see if any of the new code is particularly hot? Have you tried using a hashtable instead of a map in a ref? Or even just a dynamic array? |
|
@lpw25 For performance, we could do more benchmarks, but I'm no specialist. I did not try profiling. It is clear that the hot spots are registering fresh nodes, and traversing them for generalization. The cost of handling generic nodes in It might be worth trying with a hash table, however the current code exploits the persistence of maps, so this requires some changes in the code to undo changes. |
|
I did try using a hash table (branch The results are not bad: I.e., this is very close to trunk. Could switch to this if the difference matters. I also tried a hash table specialized to integers, but there is not improvement in performance. |
|
I will plan to review this next week. |
e5e3156 to
4bc04ab
Compare
|
Removed |
|
I also checked that remove Those are the ones exercising the type checker. |
goldfirere
left a comment
There was a problem hiding this comment.
Read through this at last. I think I would want to understand this better before Approving. Ideally comments would lead the way, but it may also be helpful to schedule a call to review this together.
|
I've reviewed @garrigue's two latest commits about updating btype and Pexp_apply. |
|
@goldfirere Thank for your view review. I am still updating the code, as you helped discover a number of possible improvements, and hope to finish by next week. Then we can schedule a call. |
garrigue
left a comment
There was a problem hiding this comment.
@goldfirere I think I answered all your questions.
I have pushed the updated code.
If you have questions we can have a discussion.
As usual 9pm in Japan would be OK. Though we have already a meeting programmed on the 20th.
|
Just one last little comment request, and then we can merge! :) |
|
Thank you for your approval. However, rebasing causes an error in the testsuite, and we are still inverstigating the cause. (There is an easy workaround, but I would really like to understand what is happening.) |
use with_local_level_generalize in typecore allow lowering generic_level nodes do more automatic generalization fix add_to_pool clear simple_abbrevs in with_local_level_generalize - rename [?post] to [?before_generalize] - use [with_local_level_generalize*] for easy cases in typecore.ml use with_local_level_generalize_* for easy parts of typeclass.ml fix copy_spine use before_generalize use automatic generalize for univars in Typecore.type_label_exp make almost all generalization automatic in typecore missing generalization in type_let no manual generalize in typecore Typeclass.type_classes automatic generalization in typedecl remove remaining uses of generalize add Changes
…-modules-bugs/pr7601_ok.ml)
ff8cfe9 to
b7aba09
Compare
|
@goldfirere Ironically, it seems that the cause of the failure is #13088, your PR about making It would be nice if you could review the last commit, which is the only new one. It also uses We again did benchmark compiling |
| let rec dummy = {level = max_int; pool = []; next = dummy} | ||
| let pool_stack = s_table (fun () -> {level = 0; pool = []; next = dummy}) () | ||
|
|
||
| let rec pool_of_level level pool = |
There was a problem hiding this comment.
It would be nice to highlight that this algorithm is linear, but that you believe amortization still suggests this will be efficient. I further think this might be more efficient than it appears because almost all the time, the topmost pool is the right one (which was also found quickly in the old code)... and higher pools are more likely the answer every step of the way.
If we wanted to keep this structure but found the linear-time access to be troublesome, we could easily switch to using binary search. This would require storing the pools in an array, but preallocating, say, 100 pools is probably fine. I don't think this change is warranted, but it might be nice to leave a comment saying that we could switch to a more efficient implementation if called for.
This PR makes the generalization automatic when leaving a scope.
Namely, we introduce a notion of type node pool in
Btype.Btype.add_to_pool.Ctype.with_local_level_generalize[_structure]create a new pool.** When leaving its scope, all the nodes in the pool which have level higher than then previous level are generalized
** In the
_structurecase, type variables rather have their level lowered.** Nodes with a level lower or equal to the previous level are moved to the pool corresponding to their actual level.
** Nodes that are already at
generic_levelandTlink's are ignored.** Calls to
lower_contravariantmust be added to thebefore_generalizeargument.generic_levelare not tracked, we need to add them back to a pool if their level is lowered (inupdate_level).We could get rid of all calls to
generalizeandgeneralize_structure, butlimited_generalizeandgeneralize_spineare still needed for classes.Checked that there is no negative impact on compiler performance on
ocamldoc.Written with @t6s