Naked pointers and the bytecode interpreter, alternative approach by xavierleroy · Pull Request #9687 · ocaml/ocaml

xavierleroy · 2020-06-16T08:52:29Z

This is an alternative to #9680 (please see for context) where code pointers in the stack of the bytecode interpreter follow approach 1a (encapsulation by setting the low bit to 1) instead of 2b (special tests at scanning time).

Rather than storing a pointer to the previous frame in the Trap_link field of the current frame, store the distance (pointer difference) between the current frame and the previous frame, tagged as an OCaml integer. Using a tagged integer instead of a raw pointer means fewer problems later with strict no-naked-pointer support. Using a distance rather than an absolute address simplifies the code that resizes the stack.
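A minimal sketch of the tagged-distance trap link, for illustration only. The `value` typedef and the `Val_long`/`Long_val` macros are simplified stand-ins for the runtime's definitions; the slot index used by `Trap_link_offset` and the helper names `set_trap_link` and `previous_trap` are assumptions, not code from this PR.

```c
#include <assert.h>
#include <stddef.h>
#include <stdint.h>

typedef intptr_t value;                 /* stand-in for the runtime's value type */
#define Val_long(n) ((value)(((uintptr_t)(intptr_t)(n) << 1) + 1))
#define Long_val(v) ((intptr_t)(v) >> 1)
#define Is_long(v)  (((v) & 1) != 0)

/* Assumed layout: word 1 of a trap frame holds the link. */
#define Trap_link_offset(tp) ((tp)[1])

/* Store the link as a tagged distance (in words) to the previous trap frame. */
static void set_trap_link(value *frame, value *prev_frame)
{
  assert(prev_frame >= frame);          /* older frames sit at higher addresses */
  Trap_link_offset(frame) = Val_long(prev_frame - frame);
}

/* Recover the previous trap frame from the tagged distance. */
static value *previous_trap(value *frame)
{
  return frame + Long_val(Trap_link_offset(frame));
}
```

Because the link is relative, copying the stack to a new location keeps every link valid without any fix-up pass, which is the simplification for stack resizing mentioned above.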

… of aligned naked pointers In "naked pointers" mode, these are no-operations. In "no naked pointers" mode, the pointer (assumed 2-aligned) is disguised as an OCaml integer by setting its low bit to 1.
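As a rough sketch, the "no naked pointers" branch of these conversions could look like the following. Only the names `Val_foreign_ptr` and `Foreign_ptr_val` and the low-bit encoding come from this PR; the bodies, the `value` typedef, and the alignment assertion are assumptions made for illustration.

```c
#include <assert.h>
#include <stdint.h>

typedef intptr_t value;                 /* stand-in for the runtime's value type */
#define Is_long(v) (((v) & 1) != 0)

/* Disguise a 2-aligned out-of-heap pointer as an OCaml integer (low bit 1). */
static inline value Val_foreign_ptr(void *p)
{
  assert(((uintptr_t)p & 1) == 0);      /* precondition: 2-aligned */
  return (value)((uintptr_t)p + 1);
}

/* Recover the original pointer by clearing the tag again. */
static inline void *Foreign_ptr_val(value v)
{
  return (void *)((uintptr_t)v - 1);
}
```

In "naked pointers" mode both functions would reduce to plain casts.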

Use the new Val_foreign_ptr and Foreign_ptr_val conversions, which are no-ops in "naked pointers" mode, but tag the code pointers like OCaml integers in "no naked pointers" mode.

Earlier, no-naked-pointers mode was effective only in native code.

…ge table `!Is_in_value_area(pc)` is always false if we turn the page table off. A better check would be `caml_find_code_fragment_by_pc(pc) != NULL`, but I feel this is too costly even for the debug mode of the interpreter.

jhjourdan

As far as I understand, this version may produce incorrect callstacks (see comments on the code) and should therefore be avoided.

jhjourdan · 2020-06-16T11:47:21Z

runtime/caml/mlvalues.h

+/* Pointers outside the OCaml heap that are used as values.
+   (E.g. code pointers.)  Must be 2-aligned. */
+
+Caml_inline value Val_foreign_ptr(void * p)


Is there a reason for using a different implementation in naked pointer mode and without this mode?

I don't think there should be compatibility problems: there should not be any external libraries that directly read entries in the bytecode interpreter stack.

I thought it would be useful for performance comparison purposes to preserve the current implementation (using unmodified naked pointers) in "naked pointers" mode. Eventually, we would keep only the "+1 / -1" implementation.

jhjourdan · 2020-06-16T13:30:02Z

runtime/backtrace_byt.c

+    /* Code pointers in the stack are tagged 1 in NO_NAKED_POINTER mode
+       and 0 otherwise */
+#ifdef NO_NAKED_POINTERS
+    if (! Is_long(*sp)) continue;
+#else
    if (Is_long(*sp)) continue;
-    p = (code_t) *sp;
+#endif
+    p = (code_t) Foreign_ptr_val(*sp);
    if (Caml_state->backtrace_pos >= BACKTRACE_BUFFER_SIZE) break;
    if (find_debug_info(p) != NULL)
      Caml_state->backtrace_buffer[Caml_state->backtrace_pos++] = p;


In no-naked-pointers mode, how can we be sure that p is not an integer value that happens to have the same value as some bytecode instruction?

Excellent observation, thank you. This is a serious problem with this approach.
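To make the ambiguity concrete: with low-bit tagging, every encoded code pointer is bit-identical to some ordinary OCaml integer, so a stack scanner cannot tell them apart by looking at the word alone. The sketch below uses simplified stand-ins for the runtime macros; `encode_code_ptr` and `colliding_integer` are hypothetical helpers written only for this illustration.

```c
#include <assert.h>
#include <stdint.h>

typedef intptr_t value;                 /* stand-in for the runtime's value type */
#define Val_long(n) ((value)(((uintptr_t)(intptr_t)(n) << 1) + 1))

/* Low-bit encoding of a 2-aligned code pointer, as in approach 1a. */
static value encode_code_ptr(const void *p)
{
  assert(((uintptr_t)p & 1) == 0);
  return (value)((uintptr_t)p + 1);
}

/* The OCaml integer whose stack representation equals the encoding of p. */
static intptr_t colliding_integer(const void *p)
{
  return (intptr_t)((uintptr_t)p >> 1);
}
```

A user integer equal to `colliding_integer(pc)` for some instruction address `pc` with debug info would therefore be reported as a backtrace entry.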

jhjourdan · 2020-06-16T13:49:12Z

runtime/backtrace_byt.c

+  while (sp < Caml_state->stack_high) {
+    if (sp == tr) {
+      tr = tr + Long_val(Trap_link_offset(tr));
+      sp += 4;


The behavior is different from the previous one. Could you document which stack cells you are ignoring here?

I got lost among the * so I tried to write equivalent code that would be clearer to me. (Basically I prefer in-out parameters to by-reference parameters.) Perhaps I failed. What did I change?

I have to say that I don't find either of the two versions very clear. This is probably because the problem to solve itself is not particularly easy.

Anyway, in the original version, we did not increment any pointer by 4, so I am surprised to see this in this version.

I should have commented. The idea was to skip the whole trap frame when we hit it. A trap frame is 4 words on the stack, and no code pointer can appear there.
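The skipping logic can be sketched as a simplified stack walk. This is an illustration under assumptions, not the PR's code: the macros are stand-ins, the link is assumed to live in word 1 of the trap frame, and `count_candidate_slots` is a hypothetical function that merely counts the tagged words a backtrace scanner would go on to examine.

```c
#include <assert.h>
#include <stddef.h>
#include <stdint.h>

typedef intptr_t value;                 /* stand-in for the runtime's value type */
#define Val_long(n) ((value)(((uintptr_t)(intptr_t)(n) << 1) + 1))
#define Long_val(v) ((intptr_t)(v) >> 1)
#define Is_long(v)  (((v) & 1) != 0)
#define Trap_link_offset(tp) ((tp)[1])  /* assumed slot for the tagged distance */
#define TRAP_FRAME_WORDS 4              /* a trap frame occupies 4 stack words */

/* Count tagged words between sp and stack_high, skipping whole trap frames. */
static size_t count_candidate_slots(value *sp, value *stack_high, value *tr)
{
  size_t n = 0;
  assert(sp <= stack_high);
  while (sp < stack_high) {
    if (sp == tr) {
      /* Hit a trap frame: follow the link, then jump over all 4 words. */
      tr = tr + Long_val(Trap_link_offset(tr));
      sp += TRAP_FRAME_WORDS;
      continue;
    }
    if (Is_long(*sp)) n++;              /* candidate encoded code pointer */
    sp++;
  }
  return n;
}
```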

jhjourdan · 2020-06-16T13:53:46Z

runtime/backtrace_byt.c

+#endif
+    p = (code_t) Foreign_ptr_val(v);
+    if (find_debug_info(p) != NULL) {
+      *sp_inout = sp; *trsp_inout = tr; return p;


Same remark as above. p could actually be an integer stored in the stack.

jhjourdan · 2020-06-16T14:16:53Z

There is an assert failure in debug mode reported by CI, so there is another bug lurking somewhere.

xavierleroy · 2020-06-16T14:24:28Z

There is an assert failure in debug mode reported by CI, so there is another bug lurking somewhere.

There remain debug assertions that are not compatible with no-naked-pointers. But, yes, I'll check that. Unless we decide to drop this approach immediately.

jhjourdan · 2020-06-16T15:17:26Z

Well, unless you find a fix for the callstack problem, I'm afraid we will indeed need to drop it.

gasche · 2020-06-16T15:40:49Z

Note: in the taxonomy of #9680 we could combine 1a (as done here) with 2a (stack frame metadata), so that the stack remains a valid OCaml value (delimcc is happy) and we have precise stack types. This also avoids the performance worries with approach 2b. It sounds like more work, but it also goes in the same direction as the closure representation (better metadata to avoid dynamic checks).

xavierleroy · 2020-06-16T16:13:11Z

The "metadata" approach is a lot of work, at least if we want to follow the approach of the native-code compiler (with the statically-generated frame descriptors). I don't think the bytecode interpreter deserves that much effort. Also, if you have metadata (2a) you don't need 1a (encoding of code pointers).

gasche · 2020-06-16T19:12:12Z

Also, if you have metadata (2a) you don't need 1a (encoding of code pointers).

The point of also doing (1a) is to make the byte stack a valid array of OCaml values, to help delimcc -- as you pointed out. This being said, maybe just sticking an Abstract_tag on the stack copy would be fine. (I see interesting uses to the feature of being able to copy stack fragments on the heap, for example for algebraic effects, but we also want them to work with the native runtime, so it needs a more general solution than a bytecode-specific approach anyway.)

xavierleroy · 2020-06-17T12:19:00Z

This being said, maybe just sticking an Abstract_tag on the stack copy would be fine

Aaaaargh! The stack and its copy contain pointers into the heap that the GC must follow!!

One solution is to have two blocks for the copy of the stack, one tagged 0 and containing the values, the other tagged Abstract and containing the code pointers.

A variant is to have only one block, tagged 0, with code pointers being encapsulated as integers, and a separate array giving the positions of the code pointers in this block, so that they can be decapsulated correctly.
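A rough sketch of this single-block variant, assuming the positions of the code pointers are known when the copy is made. The names `encapsulate` and `decapsulate` and the plain C arrays standing in for a heap block are hypothetical.

```c
#include <assert.h>
#include <stddef.h>
#include <stdint.h>

typedef intptr_t value;                 /* stand-in for the runtime's value type */
#define Val_long(n) ((value)(((uintptr_t)(intptr_t)(n) << 1) + 1))
#define Is_long(v)  (((v) & 1) != 0)

/* Tag the code pointers at the listed positions so the GC sees integers. */
static void encapsulate(value *block, const size_t *code_pos, size_t ncode)
{
  for (size_t i = 0; i < ncode; i++) {
    assert(!Is_long(block[code_pos[i]]));   /* raw pointer, 2-aligned */
    block[code_pos[i]] += 1;
  }
}

/* Undo the tagging, using the side array to know which words to touch. */
static void decapsulate(value *block, const size_t *code_pos, size_t ncode)
{
  for (size_t i = 0; i < ncode; i++) {
    assert(Is_long(block[code_pos[i]]));
    block[code_pos[i]] -= 1;
  }
}
```

The side array removes the ambiguity between integers and encapsulated code pointers, because decoding never relies on inspecting the word itself.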

Yet another solution (which delimcc uses for natively-compiled OCaml code) is to copy to a malloc()-ed block (or just a bigarray) and register a special stack scanning action with the OCaml GC. (This can be done with hooks initially put there for the systhreads library.)

xavierleroy · 2020-06-17T12:19:47Z

I'm closing this PR because this is not a viable alternative to #9680.

xavierleroy added 5 commits June 16, 2020 10:18

Add Val_foreign_ptr and Foreign_ptr_val macros for safe encapsulation…

fcbff58


Encapsulate code pointers stored in the bytecode interpreter's stack

bdf7c2c


major_gc.c: use no-naked-pointers mode even in bytecode

3a8b795


xavierleroy mentioned this pull request Jun 16, 2020

Naked pointers and the bytecode interpreter #9680

Merged

xavierleroy added the multicore-prerequisite label Jun 16, 2020

jhjourdan requested changes Jun 16, 2020


xavierleroy closed this Jun 17, 2020

xavierleroy deleted the bytecode-nnp-alt branch July 20, 2020 09:34

xavierleroy mentioned this pull request Apr 10, 2022

Avoid pushing untagged code pointers to the bytecode stack #11169

Closed
