Simplified compaction without page table#9728
Conversation
xavierleroy
left a comment
There was a problem hiding this comment.
What a pleasure to see the fabled page-less compactor!
I still don't understand the algorithm, but was not expecting to. Below are stylistic suggestions and one question.
runtime/compact.c
Outdated
| case Caml_white: | ||
| CAMLassert (Is_in_heap (q)); | ||
| if (Tag_hd (h) == Infix_tag){ | ||
| if (Is_black_val ((value) q - Infix_offset_val (q))) break; |
There was a problem hiding this comment.
If q is an infix pointer in the major heap, the enclosing closure block is in the major heap too, so how can the enclosing block be black?
On the other hand, if you suspect q may point within a closure allocated outside the heap, the test is OK but the assertion on line 74 will be false.
There was a problem hiding this comment.
Indeed, I'll update the assertion on line 74. The idea is that it's easy to check the real header. The alternative would be to mandate that every infix header outside the heap be black, but that would be more work for things like ancient or even flambda.
There was a problem hiding this comment.
In any case, white-colored infix headers out of the heap should be forbidden, because this will produce out-of-heap pointers to non-black values, which is breaking other pieces of code (e.g., weak.c).
There was a problem hiding this comment.
The conclusion of #9735 seems to be that the compactor would be the only one to rely on this invariant, so I'd rather take the less-demanding option.
xavierleroy
left a comment
There was a problem hiding this comment.
A couple of weird types.
e70b6c5 to
7cab019
Compare
jhjourdan
left a comment
There was a problem hiding this comment.
The approach looks good to me, and the code is indeed simpler.
I left a few remarks: perhaps the most important is that the new compactor should not rely on the no-naked-pointer mode and should therefore read the page table when necessary in naked pointers mode.
runtime/compact.c
Outdated
| case Caml_white: | ||
| CAMLassert (Is_in_heap (q)); | ||
| if (Tag_hd (h) == Infix_tag){ | ||
| if (Is_black_val ((value) q - Infix_offset_val (q))) break; |
There was a problem hiding this comment.
In any case, white-colored infix headers out of the heap should be forbidden, because this will produce out-of-heap pointers to non-black values, which is breaking other pieces of code (e.g., weak.c).
runtime/compact.c
Outdated
| case Caml_white: | ||
| CAMLassert (Is_in_heap (q)); | ||
| if (Tag_hd (h) == Infix_tag){ | ||
| if (Is_black_val ((value) q - Infix_offset_val (q))) break; |
| Link infix headers in each block in an inverted list of inverted lists. | ||
| Don't forget roots and weak pointers. */ | ||
| Don't forget roots and weak pointers. | ||
| This is a mark-like pass. */ |
There was a problem hiding this comment.
What do you mean by this?
To me, since this is a pass where we traverse each object in order from the first to the last in the heap, this seems very much sweep-like.
There was a problem hiding this comment.
I mean it enumerates every pointer to every block, including the roots.
runtime/compact.c
Outdated
|
|
||
| void caml_invert_root (value v, value *p) | ||
| { | ||
| CAMLassert (Is_long (*p) || Is_in_heap (*p) || Is_black_val (*p)); |
There was a problem hiding this comment.
This assertion is wrong in naked pointers mode. In no-naked-pointers mode, it is useless since it uses Is_in_heap, which will be defined to 1.
There was a problem hiding this comment.
That said, note that this assertion is failing in CI currently. This indicates that even in no-naked-pointers mode, some root contains a naked pointer (or is inverted several times?), or something else fishy.
There was a problem hiding this comment.
| CAMLassert (Is_long (*p) || Is_in_heap (*p) || Is_black_val (*p)); | |
| CAMLassert (Is_long (*p) || Is_in_heap (*p) || Is_black_val (*p) || | |
| Tag_val(*p) == Infix_tag); |
Actually, these assertion failures are simply related to white infix headers (see #9735). The change above fixes the issue.
There was a problem hiding this comment.
I dislike those complicated assertions. You're trying to check that *p is an OK OCaml value, but if it isn't the GC would have bombed already.
There was a problem hiding this comment.
These complicated assertions are usually the by-product of debugging sessions.
runtime/compact.c
Outdated
| *p = q; | ||
|
|
||
| if (t == Closure_tag){ | ||
| /* Revert the infix pointers. */ |
There was a problem hiding this comment.
| /* Revert the infix pointers. */ | |
| /* Revert the pointers to infix blocks. */ |
There was a problem hiding this comment.
There is no such thing as an infix block... There are infix pointers inside blocks, though.
|
Another TODO: finally get rid of the page table in no-noaked-pointers mode, since we don't use it any more. But I guess you want to do that in a separate PR. |
xavierleroy
left a comment
There was a problem hiding this comment.
This is starting to look very good, but please make sure that the code works in naked-pointers mode as well as in no-naked-pointers mode. We have to support both modes until OCaml 5.00 at least. And the new code should still be more efficient than the old code even in naked-pointers mode, since it has 3 passes instead of 4.
runtime/compact.c
Outdated
|
|
||
| void caml_invert_root (value v, value *p) | ||
| { | ||
| CAMLassert (Is_long (*p) || Is_in_heap (*p) || Is_black_val (*p)); |
There was a problem hiding this comment.
I dislike those complicated assertions. You're trying to check that *p is an OK OCaml value, but if it isn't the GC would have bombed already.
runtime/compact.c
Outdated
| *p = q; | ||
|
|
||
| if (t == Closure_tag){ | ||
| /* Revert the infix pointers. */ |
There was a problem hiding this comment.
There is no such thing as an infix block... There are infix pointers inside blocks, though.
Yes, that would be a separate PR. Also, remember that we still need the page table in debug mode because of all these |
Ah, I did not understand that you wanted to keep the page table for the debug mode. Then perhaps we also want to assert in debug mode that it is used rightfully in |
…mpaction algorithm and remove the use of the page table
ee57ca2 to
7f1f2da
Compare
Co-authored-by: Jacques-Henri Jourdan <jourgun@gmail.com>
Co-authored-by: Jacques-Henri Jourdan <jourgun@gmail.com>
xavierleroy
left a comment
There was a problem hiding this comment.
The code looks very good to me. However, the Travis CI is still failing hard. For the record: it tests naked-pointers mode exclusively; we still need to add no-naked-pointers mode to the Travis configurations.
jhjourdan
left a comment
There was a problem hiding this comment.
The code looks good to me. The CI failures are due to the assertion in caml_invert_root, which should either be completely removed or guarded by an #ifdef NO_NAKED_POINTERS.
| CAMLassert (Is_long (*p) || Is_in_heap (*p) || Is_black_val (*p) | ||
| || Tag_val (*p) == Infix_tag); |
There was a problem hiding this comment.
This assertion is wrong in naked pointers mode when *p is a naked pointer (e.g., a code pointer in bytecode mode, which explain the CI failure).
|
I take good note that the weird assertion is to be removed sooner rather than later. The code is fine, let's squash and merge! |
This is an alternative to #9704. Here we take advantage of the self-describing closure representation to greatly simplify the compactor, which frees one bit in each pointer, that we use to get rid of the page table. The only drawback is that we now have to encode/decode pointers with some bit-twiddling on each access.
The new compactor gets rid of one pass (out of four) and the performance seems to be slightly better than the alternatives on @xavierleroy's synthetic benchmark.
I ran the benchmark on our big server and the results looked noisy, so I ran them each several times and report min/avg/max for each.