Skip to content

Simplified compaction without page table#9728

Merged
xavierleroy merged 7 commits intoocaml:trunkfrom
damiendoligez:nnp-compact
Jul 13, 2020
Merged

Simplified compaction without page table#9728
xavierleroy merged 7 commits intoocaml:trunkfrom
damiendoligez:nnp-compact

Conversation

@damiendoligez
Copy link
Copy Markdown
Member

This is an alternative to #9704. Here we take advantage of the self-describing closure representation to greatly simplify the compactor, which frees one bit in each pointer, that we use to get rid of the page table. The only drawback is that we now have to encode/decode pointers with some bit-twiddling on each access.

The new compactor gets rid of one pass (out of four) and the performance seems to be slightly better than the alternatives on @xavierleroy's synthetic benchmark.

List length #runs # heap chunks time w/ page table (s) time w/ binary search (s) time w/ simplified compactor (s)
10_000_000 80 42 0.54 / 0.58 / 0.87 0.55 / 0.58 / 0.89 0.51 / 0.53 / 0.80
50_000_000 16 53 2.31 / 2.40 / 2.88 2.35 / 2.39 / 2.67 2.14 / 2.24 / 2.49
100_000_000 8 58 4.79 / 4.82 / 4.87 4.87 / 5.23 / 6.77 4.45 / 4.48 / 4.52
200_000_000 4 63 9.80 / 9.85 / 9.92 10.02 / 10.05 / 10.07 9.11 / 9.20 / 9.26
400_000_000 2 68 20.47 / 20.55 / 20.64 20.84 / 20.92 / 21.00 19.48 / 19.48 / 19.48

I ran the benchmark on our big server and the results looked noisy, so I ran them each several times and report min/avg/max for each.

Copy link
Copy Markdown
Contributor

@xavierleroy xavierleroy left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

What a pleasure to see the fabled page-less compactor!

I still don't understand the algorithm, but was not expecting to. Below are stylistic suggestions and one question.

case Caml_white:
CAMLassert (Is_in_heap (q));
if (Tag_hd (h) == Infix_tag){
if (Is_black_val ((value) q - Infix_offset_val (q))) break;
Copy link
Copy Markdown
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

If q is an infix pointer in the major heap, the enclosing closure block is in the major heap too, so how can the enclosing block be black?

On the other hand, if you suspect q may point within a closure allocated outside the heap, the test is OK but the assertion on line 74 will be false.

Copy link
Copy Markdown
Member Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Indeed, I'll update the assertion on line 74. The idea is that it's easy to check the real header. The alternative would be to mandate that every infix header outside the heap be black, but that would be more work for things like ancient or even flambda.

Copy link
Copy Markdown
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

In any case, white-colored infix headers out of the heap should be forbidden, because this will produce out-of-heap pointers to non-black values, which is breaking other pieces of code (e.g., weak.c).

Copy link
Copy Markdown
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Cf. #9735.

Copy link
Copy Markdown
Member Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

The conclusion of #9735 seems to be that the compactor would be the only one to rely on this invariant, so I'd rather take the less-demanding option.

Copy link
Copy Markdown
Contributor

@xavierleroy xavierleroy left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

A couple of weird types.

Copy link
Copy Markdown
Contributor

@jhjourdan jhjourdan left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

The approach looks good to me, and the code is indeed simpler.

I left a few remarks: perhaps the most important is that the new compactor should not rely on the no-naked-pointer mode and should therefore read the page table when necessary in naked pointers mode.

case Caml_white:
CAMLassert (Is_in_heap (q));
if (Tag_hd (h) == Infix_tag){
if (Is_black_val ((value) q - Infix_offset_val (q))) break;
Copy link
Copy Markdown
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

In any case, white-colored infix headers out of the heap should be forbidden, because this will produce out-of-heap pointers to non-black values, which is breaking other pieces of code (e.g., weak.c).

case Caml_white:
CAMLassert (Is_in_heap (q));
if (Tag_hd (h) == Infix_tag){
if (Is_black_val ((value) q - Infix_offset_val (q))) break;
Copy link
Copy Markdown
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Cf. #9735.

Link infix headers in each block in an inverted list of inverted lists.
Don't forget roots and weak pointers. */
Don't forget roots and weak pointers.
This is a mark-like pass. */
Copy link
Copy Markdown
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

What do you mean by this?

To me, since this is a pass where we traverse each object in order from the first to the last in the heap, this seems very much sweep-like.

Copy link
Copy Markdown
Member Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

I mean it enumerates every pointer to every block, including the roots.


void caml_invert_root (value v, value *p)
{
CAMLassert (Is_long (*p) || Is_in_heap (*p) || Is_black_val (*p));
Copy link
Copy Markdown
Contributor

@jhjourdan jhjourdan Jul 5, 2020

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

This assertion is wrong in naked pointers mode. In no-naked-pointers mode, it is useless since it uses Is_in_heap, which will be defined to 1.

Copy link
Copy Markdown
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

That said, note that this assertion is failing in CI currently. This indicates that even in no-naked-pointers mode, some root contains a naked pointer (or is inverted several times?), or something else fishy.

Copy link
Copy Markdown
Contributor

@jhjourdan jhjourdan Jul 5, 2020

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Suggested change
CAMLassert (Is_long (*p) || Is_in_heap (*p) || Is_black_val (*p));
CAMLassert (Is_long (*p) || Is_in_heap (*p) || Is_black_val (*p) ||
Tag_val(*p) == Infix_tag);

Actually, these assertion failures are simply related to white infix headers (see #9735). The change above fixes the issue.

Copy link
Copy Markdown
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

I dislike those complicated assertions. You're trying to check that *p is an OK OCaml value, but if it isn't the GC would have bombed already.

Copy link
Copy Markdown
Member Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

These complicated assertions are usually the by-product of debugging sessions.

*p = q;

if (t == Closure_tag){
/* Revert the infix pointers. */
Copy link
Copy Markdown
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Suggested change
/* Revert the infix pointers. */
/* Revert the pointers to infix blocks. */

Copy link
Copy Markdown
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

There is no such thing as an infix block... There are infix pointers inside blocks, though.

@jhjourdan
Copy link
Copy Markdown
Contributor

Another TODO: finally get rid of the page table in no-noaked-pointers mode, since we don't use it any more.

But I guess you want to do that in a separate PR.

Copy link
Copy Markdown
Contributor

@xavierleroy xavierleroy left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

This is starting to look very good, but please make sure that the code works in naked-pointers mode as well as in no-naked-pointers mode. We have to support both modes until OCaml 5.00 at least. And the new code should still be more efficient than the old code even in naked-pointers mode, since it has 3 passes instead of 4.


void caml_invert_root (value v, value *p)
{
CAMLassert (Is_long (*p) || Is_in_heap (*p) || Is_black_val (*p));
Copy link
Copy Markdown
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

I dislike those complicated assertions. You're trying to check that *p is an OK OCaml value, but if it isn't the GC would have bombed already.

*p = q;

if (t == Closure_tag){
/* Revert the infix pointers. */
Copy link
Copy Markdown
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

There is no such thing as an infix block... There are infix pointers inside blocks, though.

@xavierleroy
Copy link
Copy Markdown
Contributor

Another TODO: finally get rid of the page table in no-noaked-pointers mode, since we don't use it any more. But I guess you want to do that in a separate PR.

Yes, that would be a separate PR. Also, remember that we still need the page table in debug mode because of all these Is_in_heap assertions that litter the runtime system.

@jhjourdan
Copy link
Copy Markdown
Contributor

Also, remember that we still need the page table in debug mode because of all these Is_in_heap assertions that litter the runtime system.

Ah, I did not understand that you wanted to keep the page table for the debug mode. Then perhaps we also want to assert in debug mode that it is used rightfully in Is_in_value_area.

damiendoligez and others added 3 commits July 6, 2020 13:58
Co-authored-by: Jacques-Henri Jourdan <jourgun@gmail.com>
Co-authored-by: Jacques-Henri Jourdan <jourgun@gmail.com>
Copy link
Copy Markdown
Contributor

@xavierleroy xavierleroy left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

The code looks very good to me. However, the Travis CI is still failing hard. For the record: it tests naked-pointers mode exclusively; we still need to add no-naked-pointers mode to the Travis configurations.

Copy link
Copy Markdown
Contributor

@jhjourdan jhjourdan left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

The code looks good to me. The CI failures are due to the assertion in caml_invert_root, which should either be completely removed or guarded by an #ifdef NO_NAKED_POINTERS.

Comment on lines +98 to +99
CAMLassert (Is_long (*p) || Is_in_heap (*p) || Is_black_val (*p)
|| Tag_val (*p) == Infix_tag);
Copy link
Copy Markdown
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

This assertion is wrong in naked pointers mode when *p is a naked pointer (e.g., a code pointer in bytecode mode, which explain the CI failure).

@xavierleroy
Copy link
Copy Markdown
Contributor

I take good note that the weird assertion is to be removed sooner rather than later. The code is fine, let's squash and merge!

@xavierleroy xavierleroy merged commit 601819f into ocaml:trunk Jul 13, 2020
@damiendoligez damiendoligez deleted the nnp-compact branch July 15, 2020 09:35
Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment

Projects

None yet

Development

Successfully merging this pull request may close these issues.

3 participants