Skip to content

Improved register typing#1192

Closed
mshinwell wants to merge 7 commits intoocaml:trunkfrom
mshinwell:nnp-const_symbol
Closed

Improved register typing#1192
mshinwell wants to merge 7 commits intoocaml:trunkfrom
mshinwell:nnp-const_symbol

Conversation

@mshinwell
Copy link
Copy Markdown
Contributor

I came across a case today, when testing with "no naked pointers" enabled, where code pointers had been spilled onto the stack (prior to being used for building closures) and registered as GC roots. This is not permissible under the current no naked pointers rules.

I think it is probably unnecessary to register symbols as GC roots within functions in any case. The addresses of symbols are invariant under GC. As far as I know the only symbols whose contents need to be scanned are the Flambda Initialize_symbol ones---and those are all referenced from the global GC roots tables anyway.

I suspect we can fix these problems by assigning Cconst_symbol expressions the "int" instead of the "val" register type. Does anyone know of any reason why this might not be safe?

What happens during instruction selection needs careful reasoning about. If there are two branches and a join point then the code now takes the least upper bound of the register types from the branches. This seems necessary for correctness of this change. It may mean that a symbol gets registered as a GC root when not strictly required (for example if one branch of a conditional dynamically constructs a pair and the other branch returns a statically-allocated pair). Could it lead to a situation where a register is registered as a root and doesn't contain a valid value? I'm not completely certain but the only case for such invalid values might be that of code pointers. In that case I don't see how the situation could arise.

We could maybe enhance the register typing to at least enable us to catch these cases during compilation. For example we could have "static data that cannot be scanned" which would result in an error if it occurs at a join point with any other register type. We could add "static data that can be scanned" for completeness, perhaps.

@mshinwell mshinwell requested a review from chambart June 6, 2017 13:37
Some(self#insert_op (Iconst_float (Int64.bits_of_float n)) [||] r)
| Cconst_symbol n ->
let r = self#regs_for typ_val in
let r = self#regs_for typ_int in
Copy link
Copy Markdown
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

The reason why this works is quite intricate, this certainly need some comment to explain it (if you don't end up adding a new register type to represent that).

Copy link
Copy Markdown
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

FYI: I had a crash with 4.08 and no-naked-pointers that we narrowed down to another instance of this bug with the help of @mshinwell

@chambart
Copy link
Copy Markdown
Contributor

chambart commented Jun 6, 2017

By the way, this also suggest that there is some problem on the ordering of instructions: It is probably better in the situation you encountered to delay loading rather than spilling.

@mshinwell
Copy link
Copy Markdown
Contributor Author

Yes, I did wonder about that too. I can investigate that aspect further.

@chambart
Copy link
Copy Markdown
Contributor

chambart commented Jun 6, 2017

The lub of a register of type "static data that cannot be scanned" and int should be "static data that cannot be scanned" I think, and not an error. But otherwise, this seems like a sensible addition to the register type lattice.

I can imagine one situation where joining might occur, but not with the current compiler: If we add some cmm level unboxing of allocated blocs, (with some gadts to allow typing that, but lets assume we did that correctly):

let block =
  match x with
  | A -> (b, c)
  | B -> fun d -> d + e
in
match x with
| A -> fst block
| B -> block 3

if it gets unboxed in some way, we might end up with the variable b ending up in the same register as the code pointer for the function.

@mshinwell
Copy link
Copy Markdown
Contributor Author

This makes me think that spending more time trying to optimise GC headers before function bodies may in fact be worthwhile.

@chambart
Copy link
Copy Markdown
Contributor

chambart commented Jun 6, 2017

Optimise ?

@mshinwell
Copy link
Copy Markdown
Contributor Author

Yeah, I think the code size penalty is too great at present. (The work is in #203 and the results discussed in #1156 )

@xavierleroy
Copy link
Copy Markdown
Contributor

I believe the proposed change (Cconst_symbol of type Tint) is sound. Still, it is weird that a pointer to the code of a function ends up spilled.

@mshinwell
Copy link
Copy Markdown
Contributor Author

mshinwell commented Aug 2, 2017

I decided to try to improve the register typing so that we can solve this problem in a more obviously correct manner. The results are now in this pull request.

In the new world, there are two parts to a register type, as seen in the diff for cmm.mli. The first part relates to register allocation:

  1. Whether the value needs to be in an integer or floating-point register.
  2. If the value is in an integer register, what the behaviour of the GC should be upon the contents of such register.

The second part is called the "GC action" and has the following possibilities.

  1. "Must scan". This is the usual case for registers holding OCaml values.
  2. "Can scan". This is used for tagged OCaml integers and out-of-heap pointers to blocks that look like OCaml values. Such things do not have to be scanned, but may be scanned safely, by the GC.
  3. "Cannot scan". Used for arbitrary pointers (which are allowed to be live across a GC). For example, the pointer to a function's code, or a pointer to caml_backtrace_pos as in Cmmgen.raise_regular. In the future, we expect Flambda to be able to unbox int32 and int64 values, which will I imagine also produce registers of this type.
  4. "Cannot be live at GC". (Edit: these have all the properties of "Cannot scan" too.) Registers with this type will cause a fatal error if they are live at a GC point. An example is a derived pointer into the middle of a block in the heap, for example as a result of array indexing.

At join points (and assignments) an operation is used, as at present, that takes the least upper bound of these actions. Here is the ordering:

      Cannot_be_live_at_gc
              ^
            /   \
           /     \
          /       \
         /         \
        /           \
    Must_scan   Cannot_scan
        ^           ^
         \          /
          \        /
           \      /
            \    /
           Can_scan

For example, a conditional with one arm returning "must be scanned" and another arm returning "can be scanned" will cause the result register to be eligible for root registration. A fatal error will be produced at such points if an attempt is made to join "must be scanned" with "cannot be scanned".

The careful distinctions here enable the compiler code to be more explicit, easier to understand and robust. As an example, bigarray indexing calculations currently produce register type Int, even though the calculated value is a pointer. This has two problems:

  1. Even though it "should not happen", there is no check that such a register doesn't appear (perhaps due to a subtle mistake in the code) at a join point and be promoted to register type Val. That could cause a crash. In the new world, bigarray indexing produces registers with GC action Cannot_scan, which if joined with Must_scan (the equivalent of Val) will cause a fatal error. (Another example is registers holding sub-word quantities read from bigarrays before boxing; these are now marked as Cannot_scan.)
  2. The existing code assumes that registers of type Int have width Arch.size_int whilst (as far as I can tell) leaving open the possibility of having Arch.size_int be smaller than Arch.size_addr. We have a pointer in that register...

As far as point 2 goes, this patch adds an assertion that Arch.size_int is equal to Arch.size_addr; I believe this is the case for all platforms we currently support. This restriction avoids having to subdivide Can_scan.

This patch reduces the number of Cmm language constructs by coalescing the various "add" operations into a single operation equipped with its result type. A similar simplification has been made to memory chunks; and it is now explicit that any sub-word memory chunk is of type "cannot be scanned". I also took the liberty of renaming Ccmpi to Ccmps and Ccmpa to Ccmpu to reflect that what's happening here is the difference between signed and unsigned comparison rather than anything to do with integers or derived pointers.

The patch annotates Cconst_symbol, which was the original point of this pull request, with a description of what is being pointed at. This description is effectively a subset of the register type. This solves the original problem of needless (or incorrect) scanning of symbols. This also allows situations such as a constant pair on one side of a conditional and a computed pair on the other to be described accurately. (The constant pair will have to put up with being scanned.) It also ensures that registers holding the addresses of symbols that are not scannable (e.g. caml_backtrace_pos, the AFL instrumentation block, etc.) cannot silently flow to join points where they may be incorrectly promoted to roots.

The patch clarifies that the int_as_pointer primitive is only to be used when the result is a scannable pointer (even if outside the heap). This is in line with the general methodology that should be applied to Cannot_scan and Cannot_be_live_at_gc pointers: since these can cause compile-time fatal errors at the least upper bound check and in the emitter, respectively, registers of these types should only be introduced as short-lived temporaries for the compilation of some primitive (or similar). As above, this is now checked.

Various other potentially dubious pieces of code have been fixed as a result of these changes:

  1. The Spacetime instrumentation code was using Caddi wrongly.
  2. Hard registers have been marked as Must_scan (previously they were Int). These shouldn't end up being scanned, of course, but it seems more correct like this. Maybe I am wrong.
  3. There was an incorrect assertion in Selectgen.
  4. It looked like there was an overly-specific Cstore pattern in select_operation for x86-64.

I need to make another pass over Cmmgen and work a bit more on this patch, together with making the necessary small changes to the non-x86-64 emitters, before it is ready. However I would appreciate any opinions on the general structure of this work now. (Particularly from @xavierleroy :) )

@mshinwell mshinwell changed the title Use "int" register type for [Cconst_symbol] expressions Improved register typing Aug 2, 2017
@damiendoligez damiendoligez added this to the 4.07-or-later milestone Sep 25, 2017
@damiendoligez damiendoligez removed this from the consider-for-4.07 milestone Jun 5, 2018
| Caddi | Csubi | Cmuli | Cmulhi | Cdivi | Cmodi
| Cadd of gc_action (* the [gc_action] gives the type of the result *)
| Csubi | Cmuli | Cmulhi | Cdivi | Cmodi
| Cand | Cor | Cxor | Clsl | Clsr | Casr
Copy link
Copy Markdown
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

We've just realised that more (all?) of the arithmetic operations will need altering so that they specify the type of the thing (e.g. tagged integer, naked int32, etc...) that they are working on.

Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment

Labels

None yet

Projects

None yet

Development

Successfully merging this pull request may close these issues.

5 participants