use [raw_field] primitives in camlinternalMod by gasche · Pull Request #9691 · ocaml/ocaml

gasche · 2020-06-17T12:22:29Z

Looking at #9690 (changes to caml_{alloc,update}_dummy for the new closure representation of #9619), I have the impression that camlinternalMod.update_mod needs to be updated as well. Here is a naive proposal.

(cc @xavierleroy @jhjourdan)

gasche · 2020-06-17T12:26:50Z

stdlib/camlinternalMod.ml

+  assert (Obj.size o >= Obj.size n);
+  for i = 0 to Obj.size n - 1 do
+    Obj.set_raw_field o i (Obj.raw_field n i)
+  done


I'm not sure how to reason about correctness on fields that are not the code pointer. I have the impression that we are doing a raw write, when we probably should use a proper write barrier. If this is indeed the case, then the fix would be to loop over the precise closure representation, instead of handling all fields as raw data. (I don't think there is OCaml-side code doing this yet, and it might benefit from some auxiliary functions in Obj.)

I added a type Obj.Closure.info = { arity: int; start_env : int }, which is now used to apply set_raw_field to closure metadata fields and set_field to the others.

You're right that closures need a special case. However, this code is not good, for the reason you mention.

Here I think it's enough to copy field 0 using raw-field accesses and all other fields using normal accesses. In block o there is only one code pointer at offset 0, the rest is integers. As mentioned elsewhere, caml_modify (i.e. Obj.set_field) of a code pointer over an integer is safe.

In your new code you still need to justify why you're taking the startenv of the new block and not the one of the old block.

> In block o there is only one code pointer at offset 0, the rest is integers.

My understanding is that overwrite_closure may be called with n an arbitrary closure and o being the template value of the init_mod case:

Obj.repr (fun _ -> raise (Undefined_recursive_module loc))

Here we have a non-integer closure field for loc.

> In your new code you still need to justify why you're taking the startenv of the new block and not the one of the old block.

Can I start by filling the segment [o.start_env; n.start_env] with unit values?

Well spotted. I stand corrected. But I think this is also a problem with your approach.

There are TWO kinds of assignments between fields that must be avoided:

1. assigning via caml_modify (i.e. Obj.set_field) when the old value is a code pointer;

2. assigning directly (i.e. Obj.set_raw_field) when the old value is a pointer within the major heap, because then it may escape marking even though it is live elsewhere. (This is why caml_modify does a caml_darken on the old value.)

If you Obj.set_raw_field over the field of o that contains the loc free variable, you're running into the second forbidden case.

One solution is to first set fields 1 ... size - 1 of o to an integer value:

Obj.set_field o i (Obj.repr 0)

so that the loc value is properly seen by the GC.

Then you can do your copy with raw assignments up to startenv(n), and normal assignments above.

let n_start_env = Obj.Closure.((info n).start_env) in
let o_start_env = Obj.Closure.((info o).start_env) in
(* if the environment of n starts before the one of o,
   clear the raw fields in between. *)
for i = n_start_env to o_start_env - 1 do
  Obj.set_raw_field o i Nativeint.one
done;
(* if the environment of o starts before the one of n,
   clear the environment fields in between. *)
for i = o_start_env to n_start_env - 1 do
  Obj.set_field o i (Obj.repr ())
done;
for i = 0 to n_start_env - 1 do
  (* code pointers, closure info fields, infix headers *)
  Obj.set_raw_field o i (Obj.raw_field n i)
done;
for i = n_start_env to Obj.size n - 1 do
  (* environment fields *)
  Obj.set_field o i (Obj.field n i)
done;
()

what do you think?
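For readers who want to check the reasoning in this exchange, here is a minimal sketch on a toy model; it is not the stdlib code. The type `field`, the exception `Unsafe`, and the functions `naive_overwrite` and `careful_overwrite` are hypothetical names introduced for illustration. Closure blocks are modelled as arrays, and the two forbidden assignments are turned into exceptions so that a violation is visible. The model is deliberately conservative: it treats every metadata word as raw.

```ocaml
(* Toy model of a closure block: each field is one of three kinds. *)
type field =
  | Raw of string  (* code pointer or closure-info word; not a GC value *)
  | Imm of int     (* immediate integer, harmless under either write *)
  | Ptr of string  (* pointer to a block in the major heap *)

exception Unsafe of string

(* Models Obj.set_field (caml_modify): forbidden when the old value is raw. *)
let set_field o i v =
  match o.(i) with
  | Raw _ -> raise (Unsafe "caml_modify over a raw word")
  | Imm _ | Ptr _ -> o.(i) <- v

(* Models Obj.set_raw_field: forbidden when the old value is a heap pointer. *)
let set_raw_field o i v =
  match o.(i) with
  | Ptr _ -> raise (Unsafe "raw write over a heap pointer")
  | Raw _ | Imm _ -> o.(i) <- v

(* The naive copy: every field is written as raw data. *)
let naive_overwrite o n =
  Array.iteri (fun i v -> set_raw_field o i v) n

(* The four-loop copy, transcribed onto the model.
   [Imm 0] stands for both Nativeint.one (a tagged 0) and unit. *)
let careful_overwrite ~o ~o_start_env ~n ~n_start_env =
  assert (Array.length o >= Array.length n);
  for i = n_start_env to o_start_env - 1 do set_raw_field o i (Imm 0) done;
  for i = o_start_env to n_start_env - 1 do set_field o i (Imm 0) done;
  for i = 0 to n_start_env - 1 do set_raw_field o i n.(i) done;
  for i = n_start_env to Array.length n - 1 do set_field o i n.(i) done

(* Dummy closure holding [loc] in its environment, which starts at 2. *)
let dummy () = [| Raw "stub_code"; Raw "info"; Ptr "loc"; Imm 0 |]

(* New closure whose environment starts at 3. *)
let target = [| Raw "curry"; Raw "info"; Raw "direct_code"; Ptr "env" |]

(* The naive copy hits the second forbidden case on the [loc] field. *)
let naive_is_unsafe =
  match naive_overwrite (dummy ()) target with
  | () -> false
  | exception Unsafe _ -> true

(* The careful copy clears [loc] through set_field before any raw write. *)
let careful_result =
  let o = dummy () in
  careful_overwrite ~o ~o_start_env:2 ~n:target ~n_start_env:3;
  o

let () =
  Printf.printf "naive copy unsafe: %b\n" naive_is_unsafe;
  Printf.printf "careful copy matches target: %b\n" (careful_result = target)
```

In this model the field holding loc is first replaced by an immediate through the barrier-respecting write, so the later raw write never lands on a heap pointer. The symmetric case, where the environment of o starts after the one of n, goes through the first loop instead.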

gasche · 2020-06-17T13:18:17Z

The Obj.Closure submodule in this PR could possibly be extended a bit, but carefully. At first I defined an abstract type code_ptr with code_ptr : Obj.t -> code_ptr and set_code_ptr : Obj.t -> code_ptr -> unit with the trace code in mind. Then I realized that the setter might distress the multicore and/or flambda people, so I got rid of it; for now it only exposes read-only operations.

xavierleroy

overwrite_closure is getting pretty robust, that's nice. The last set of assignments could be avoided (see below), but that's a performance optimization, not a correctness issue.

I'm afraid there is still an issue with the construction of the initial, dummy closure.

jhjourdan · 2020-06-17T13:56:56Z

@gasche: I see you mentioned me. However, I don't know anything about how camlinternalMod works, and I don't have a good understanding of #9619.... Hence I don't feel like I have expertise here.

gasche · 2020-06-17T13:57:57Z

> Hence I don't feel like I have expertise here.

Yet!

xavierleroy

Looks good, except a signed/unsigned error in the extraction of the "arity" field.

xavierleroy · 2020-06-17T14:51:43Z

stdlib/obj.ml

+    (* the nativeints below are unsigned, but we know they can
+       always fit an OCaml integer so we use [to_int]
+       instead of [unsigned_to_int]. *)
+    let arity =
+      if Sys.word_size = 64 then
+        to_int (shift_right_logical info 56)
+      else
+        to_int (shift_right_logical info 24)


Arity is signed (negative arity = tupled function), so please use shift_right and update the comment.

Thanks! This should now be fixed.
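To make the signed/unsigned point concrete, here is a small standalone sketch. It assumes only what the diff above shows, namely that the arity sits in the top 8 bits of the info word (shift by 56 on 64-bit, 24 on 32-bit). The helper names `pack_arity`, `arity_signed` and `arity_logical` are hypothetical and are not part of Obj.

```ocaml
(* Number of bits below the arity byte: 56 on 64-bit, 24 on 32-bit. *)
let arity_shift = Sys.word_size - 8

(* Build an info word that carries only an arity in its top byte
   (hypothetical helper; the real word also stores the environment offset). *)
let pack_arity (arity : int) : nativeint =
  Nativeint.shift_left (Nativeint.of_int arity) arity_shift

(* Arithmetic shift: the sign bit is propagated, negative arities survive. *)
let arity_signed (info : nativeint) : int =
  Nativeint.to_int (Nativeint.shift_right info arity_shift)

(* Logical shift: zeros are inserted, so a negative arity reads as positive. *)
let arity_logical (info : nativeint) : int =
  Nativeint.to_int (Nativeint.shift_right_logical info arity_shift)

let () =
  let info = pack_arity (-2) in  (* a tupled function of two arguments *)
  Printf.printf "signed: %d, logical: %d\n"
    (arity_signed info) (arity_logical info)
  (* prints "signed: -2, logical: 254" *)
```

For non-negative arities both shifts agree, which is why the bug only shows up on tupled functions.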

xavierleroy · 2020-06-17T14:52:37Z

runtime/obj.c

  }

+  if (tg == Closure_tag) {
+    /* Closinfo_val is the seconnd field, so we need size at least 2 */


Typo: "seconnd".

Cosmetic: switch (tg) { case Closure_tag: ... case String_tag: ... } could look good. Maybe.

I did the switch (haha) in a separate commit, let me know if you like it.

gasche · 2020-06-17T20:56:57Z

I fixed the remaining comments, rebased the commit history, and added a Changes entry (common with #9690 and #9681).

xavierleroy

The new "switch" is good, but carried you too far! See explanation below.

xavierleroy · 2020-06-18T08:45:53Z

runtime/obj.c

+  case Custom_tag: {
+    /* It is difficult to correctly use custom objects allocated
+       through [Obj.new_block], so we disallow it here. The first
+       field of a custom object must contain a valid pointer to
+       a block of custom operations. Without initialisation, hashing,
+       finalising or serialising this custom object will lead to
+       crashes.  See #9513 for more details. */
+    caml_invalid_argument ("Obj.new_block");
+  }


I'm sorry to be difficult, but the Custom_tag case must be rejected BEFORE doing caml_alloc. The reason is as follows: if the block is big enough, it is allocated in the major heap, with tag Custom_tag but an invalid "operations" field. Then, we raise an exception, so the block is unreachable. So, at the next major GC sweep, it will be collected, and the sweeper will look into the "operations" field to see if there is a finalization function. This will segfault.

I see. I noticed that the allocation would happen, but I figured that this would be fine given that this exception corresponds to a programming error, that is not supposed to be handled by the program. Your point about the segfault is very reasonable :-)

For the other failure cases in the function, there is no such issue: there are failure cases that can lead to ill-formed values on the heap, but they are not ill-formed enough that it would be a problem for the GC, which is the only case we have to consider for this allocation that is dead on arrival.

@xavierleroy I moved back the Custom_tag check, but then I pushed a second commit that moves the allocation within the switch, so that it is always after the failure test. Let me know what you prefer.
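The user-visible side of this check can be observed from OCaml. The sketch below assumes a compiler whose runtime includes the Custom_tag rejection quoted above; `rejects` is a hypothetical helper. It shows only that the request fails with Invalid_argument. It cannot show whether the check happens before or after the allocation, which is the point under discussion.

```ocaml
(* Returns true when [f ()] raises Invalid_argument, false otherwise. *)
let rejects f =
  match f () with
  | _ -> false
  | exception Invalid_argument _ -> true

(* Asking for a block with Custom_tag is refused by the runtime. *)
let custom_rejected =
  rejects (fun () -> Obj.new_block Obj.custom_tag 4)

let () =
  Printf.printf "Obj.new_block Obj.custom_tag 4 rejected: %b\n" custom_rejected
```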

xavierleroy

I'm approving because the code looks correct to me and to stop you from torturing it some more :-)

This change is careful to avoid writing a value into what was previously a raw field or conversely, clearing fields that change category first. We also clear the end of the block, to make it easier to reason about the lifetime of values that could have been referenced there. (We don't expect this to make a difference in practice.)

xavierleroy · 2020-06-18T13:25:24Z

There was a suspicious Travis failure before, so let's wait until T&A succeed.

gasche · 2020-06-18T13:26:53Z

The CI was failing with a weird failure (it claimed that it did not know about to_int but there was an open Nativeint above). The Obj->Nativeint dependency was missing in the .depend, so I refreshed it, but I am skeptical that this was the reason.

gasche · 2020-06-18T15:06:32Z

Now Travis is green but AppVeyor fails (when trying to build FlexDLL, it seems) with a strange error that does look related to this PR (but may be an issue with the PR that introduced Obj.raw_field, or the AppVeyor setup):

File "reloc.ml", line 1:
Error: Error while linking ../boot\stdlib.cma(Stdlib__obj):
       The external function `caml_obj_raw_field' is not available
make[1]: *** [Makefile:140: flexlink.exe] Error 2

xavierleroy · 2020-06-18T15:09:19Z

I'll run CI precheck on this PR. Concerning the Appveyor failure, I don't know the details of the "build FlexDLL while we're building OCaml" procedure, so I'm mentioning @dra27.

gasche · 2020-06-18T19:32:48Z

(precheck seems happy, the only failure was also present in other builds.)

xavierleroy · 2020-06-19T15:38:53Z

It looks like the "build FlexDLL while we're building OCaml" procedure assumes a properly-bootstrapped system and will fail if boot/ocamlc doesn't know about newly-added primitives. I'll discuss that on caml-devel.

gasche · 2020-06-23T12:40:12Z

Note: the flexdll build failure was fixed in #9700.

Note (from the later cleanup in #12625): the Obj.Closure module was introduced in #9691, for use in CamlinternalMod, but was rendered obsolete by #10205.

gasche force-pushed the closure-repr-camlinternalMod branch from 638bdb3 to 0905589 Compare June 17, 2020 12:23

xavierleroy reviewed Jun 17, 2020

gasche force-pushed the closure-repr-camlinternalMod branch from 9b0bfae to 367e7e4 Compare June 17, 2020 14:20

xavierleroy reviewed Jun 17, 2020

gasche force-pushed the closure-repr-camlinternalMod branch from 367e7e4 to 65270a9 Compare June 17, 2020 20:48

xavierleroy reviewed Jun 18, 2020

gasche force-pushed the closure-repr-camlinternalMod branch from 7a2ca1d to df5658d Compare June 18, 2020 12:21

xavierleroy approved these changes Jun 18, 2020

gasche added 3 commits June 18, 2020 15:23

Obj: basic access to closure metadata

bac2e05

ensure that Obj.new_block returns a sensible uninitialized closure

d1faa2a

gasche force-pushed the closure-repr-camlinternalMod branch from df5658d to 4cde684 Compare June 18, 2020 13:23

gasche added 3 commits June 18, 2020 15:24

caml_obj_block: use a switch on the tag

f5164ca

Changes

2ef50f9

caml_obj_block: move the allocation within the switch

514d2ff

gasche force-pushed the closure-repr-camlinternalMod branch from 4cde684 to 514d2ff Compare June 18, 2020 13:24

xavierleroy mentioned this pull request Jun 18, 2020

usability issue: no error when opening an alias to a missing module #9695

Closed

xavierleroy merged commit bdbf5c3 into ocaml:trunk Jun 19, 2020

lthls mentioned this pull request Oct 3, 2023

Remove the Closure module from Obj #12625

Merged

xavierleroy pushed a commit that referenced this pull request Oct 4, 2023

Remove the Closure module from Obj (#12625)

1649965

This module was introduced in #9691, for use in CamlinternalMod, but rendered obsolete by #10205.

sadiqj pushed a commit to sadiqj/ocaml that referenced this pull request Oct 8, 2023

Remove the Closure module from Obj (ocaml#12625)

206ad4a

This module was introduced in ocaml#9691, for use in CamlinternalMod, but rendered obsolete by ocaml#10205.
