Skip to content

Add Obj.null#9535

Closed
kayceesrk wants to merge 3 commits intoocaml:trunkfrom
kayceesrk:obj_null
Closed

Add Obj.null#9535
kayceesrk wants to merge 3 commits intoocaml:trunkfrom
kayceesrk:obj_null

Conversation

@kayceesrk
Copy link
Copy Markdown
Contributor

@kayceesrk kayceesrk commented May 6, 2020

This PR adds a distinguished null value to the Obj module that has the same bit pattern as NULL in C (all bits set to 0 according to C99 standard).

The need for Obj.null

C bindings and low-level code sometimes have the need to represent a value that is either NULL or is a valid OCaml value. This is permitted by the OCaml compiler as the GC uses the page table to identify that NULL is outside the heap and skips following the object. However, with no-naked-pointers mode, where the GC only consults the page table for code pointers, NULL value to be a value on the major heap i.e, Is_block(NULL) && !Is_young(NULL) holds. This would lead to a crash if the GC follows the NULL pointer.

Implementation

In the naked pointer mode, the only places where the GC decides whether to follow a pointer is in caml_darken and mark_slice_darken. We add a check to see if the value is non-NULL before following it.

edit: In weak.c, the checks in Must_be_Marked_during_mark and Is_Dead_during_clean now check for NULL value.

Interaction with Multicore OCaml

This scheme is also compatible with the multicore compiler. Multicore currently uses a parallel minor GC, and Obj.null poses no special issues. The alternative concurrent minor GC uses a fast read barrier test to decide whether a pointer is in a foreign minor heap arena. See section 4.3.2 in the multicore GC paper.

Assuming a similar 16-bit address space, the precondition we need is that PQ(bx) != 0 /\ PQ(bx) != 1.

/* ax == 0 /\ PQ(bx) != 0 /\ PQ(bx) != 1 */
xor %bx , %ax
/* ax == bx /\ PQ(ax) != 0 /\ PQ(ax) != 1 */
sub 0x0010 , %ax
/* PQ(ax) != 0 */
test 0xff01 , %ax
/* ZF not set */

It is unlikely that the minor heap area gets allocated between at the very beginning of the virtual memory area. If we get unlucky, we'll just reserve that space and request another contiguous virtual memory chunk. In the worst case, on 64-bit address space, with the current multicore settings for the number of domains (128 max) with each domain having a maximum 16MB of minor heap arena, we may have to reserve (not allocate) 8GB of virtual address space. @stedolan.

@let-def
Copy link
Copy Markdown
Contributor

let-def commented May 6, 2020

That is indeed a very good idea for bindings. Thanks.

@bobot
Copy link
Copy Markdown
Contributor

bobot commented May 6, 2020

I'm surprised caml_darken and mark_slice_darken are the only place where it is followed. At least in weak.c there are others.

EDIT: Or I misunderstood the problematic cases.

@kayceesrk kayceesrk marked this pull request as draft May 6, 2020 06:36
@kayceesrk
Copy link
Copy Markdown
Contributor Author

@bobot Given that weak.c already has a distinguished caml_ephe_none, then all the cases that need to represent the non-existence of a key or a value in a weak array will use that? Can you point to ones that I'm missing? I can work on covering those.

@kayceesrk
Copy link
Copy Markdown
Contributor Author

@kayceesrk
Copy link
Copy Markdown
Contributor Author

I've covered those cases @bobot.

@xavierleroy
Copy link
Copy Markdown
Contributor

For the record: I'm not enthusiastic at the idea that NULL (the all-zero bit pattern) is guaranteed to be a well-formed OCaml value, even after we get rid of "naked pointers". In the motivating example, Val_unit would work just as well as a dummy initialization value. We don't have "null" values in the OCaml language (despite a past proposal on April 1st); why should we have them in the FFI?

@lpw25
Copy link
Copy Markdown
Contributor

lpw25 commented May 6, 2020

A value guaranteed not to overlap with any ordinary OCaml values would be a useful addition to Obj in general. Things like Base's Option_array could be given a simpler implementation and could support things like marshalling and polymorphic equality correctly.

I don't think that is the main motivation here, but I'd still be very grateful for it.

@xavierleroy
Copy link
Copy Markdown
Contributor

A value guaranteed not to overlap with any ordinary OCaml values would be a useful addition to Obj in general.

Except that if this particular value is made easily available under the name Obj,null, it becomes an ordinary OCaml value! Such "special" values need to be local to the modules that use them, and for that the trick with Abstract_tag blocks is better.

@kayceesrk
Copy link
Copy Markdown
Contributor Author

Noting down for the sake of completeness: Marshaling needs to be made aware of Obj.null value. Currently, marshaling an object with an Obj.null value raises exception Invalid_argument("output_value: abstract value (outside heap)").

@xavierleroy
Copy link
Copy Markdown
Contributor

Do we really want to be able to marshal Obj.null? And what about comparison, hashing, etc? In all these primitives, there is a path to handle naked pointers, and we were hoping to get rid of those code paths when naked pointers are disallowed, but with this PR the special path is back, just to handle Obj.null. I'm less and less enthusiastic.

@stedolan
Copy link
Copy Markdown
Contributor

stedolan commented May 6, 2020

Is the NULL pointer value actually used in many C bindings? The two examples from #9534 aren't terribly convincing.

In lablgtk, a null pointer is used in the implementation of Gpointer.optstring. However, Gpointer.optstring is only used once, in the implementation of CList.insert which wraps gtk_clist_insert. It's probably easy to change one wrapper function?

In frama-c, the following snippet exists:

let zeroword = Obj.field (Obj.repr 0L) 0;;
let null = zeroword;;

This does not in fact create a zero word, as only the second word of Obj.repr 0L is zero. This actually extracts a naked pointer to the structure caml_int64_ops. The fact that this code works implies that it doesn't care much what this value is, so presumably let null = Obj.repr (ref []) would work just as well.

@kayceesrk
Copy link
Copy Markdown
Contributor Author

I am yet to see an example that requires Obj.null, which cannot be expressed without it. I am not convinced that we need Obj.null.

@lthls
Copy link
Copy Markdown
Contributor

lthls commented May 7, 2020

The frama-c unmarshal.ml file is really a good example of unsafe things happening in the wild.
The endian computations actually cause flambda2 to assume that the program could do anything and may as well segfault at toplevel, saving the trouble of compiling the rest of the file.

@xavierleroy
Copy link
Copy Markdown
Contributor

The frama-c unmarshal.ml file is really a good example of unsafe things happening in the wild.

Cc the authors @pascal-cuoq and @damiendoligez . Yet, to me, this code is one of the very few legitimate uses of the Obj module around. Where do we go from here? I don't know.

@lthls
Copy link
Copy Markdown
Contributor

lthls commented May 7, 2020

Just to clarify, the functions and structures in this module are fine, and handled correctly by flambda2. The problematic parts are the endianness computations and the zeroword definition, for which a C stub would have been much better in my opinion (and for integer endianness, at least, it's available as Sys.big_endian since OCaml 4.00).

@xavierleroy
Copy link
Copy Markdown
Contributor

@kayceesrk , @stedolan and I have expressed doubts on the usefulness of Obj.null. @kayceesrk, is it OK with you to close this PR?

@kayceesrk
Copy link
Copy Markdown
Contributor Author

Yes. Let's close this PR since we haven't found a convincing usecase for null yet.

@kayceesrk kayceesrk closed this May 14, 2020
@lpw25
Copy link
Copy Markdown
Contributor

lpw25 commented Aug 27, 2020

This issue came up again in @gadmm's talk at the ML Workshop. It sounded like he had some examples of existing code using null as a naked pointer. @gadmm Do you have such examples? If so we might want to re-consider including the check for NULL in caml_darken and mark_slice_darken.

Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment

Labels

None yet

Projects

None yet

Development

Successfully merging this pull request may close these issues.

7 participants