Remove unnecessary regions for %int_as_pointer#2677
Remove unnecessary regions for %int_as_pointer#2677mshinwell wants to merge 3 commits intooxcaml:mainfrom
Conversation
|
I'm not sure it causes any issues currently, but I want to clarify what the region is for:
The aim is to ensure that the GC can never see the result of this primitive after the end of the region. It is safe for the GC to see the result before then, and indeed I would expect it to do so in a lot of cases where we pass the value into some other function that takes a local argument. It is also safe for the GC not to look at the result at all -- so having it be Note that this means you can get code like: type t = { res : int }
let id x = x
[@@inline never]
let foo x =
let y = int_as_pointer x in
let z = id y in
z.res
let bar arg =
let a = malloc arg in
let res = foo a in
free a;
reswhere it is important that the GC does not look at |
|
Ah yes, indeed these cases with I have wondered about keeping the regions right until the end of Cfg, at which point they could be deleted. However this requires changing the Mach/Cfg operations to hold onto the region registers somehow (and also to add an int_as_pointer Cmm primitive etc). It seems like that could maybe affect code generation, though I would have to discuss with Greta and Xavier and think about this further. |
|
@chambart suggested to use other variables with kind Region (I think it's also what Leo suggested above). Every function would have two region parameters: one for stack allocations, and one for scoping only. When translating from lambda, we would insert |
|
See #2682 |
|
Superceded by #2682 |
This PR aims to correct bad code generation when
%int_as_pointeris used at local mode, in a context where the corresponding region is not required for any actual stack allocation. For example in the following (which can be compiled with the boot compiler using-nostdlib -nopervasives):we would like it to be the case that the code for
setis the same as fornormal_setexcept for one subtraction (corresponding to%int_as_pointer). This is the case after this PR. At the moment it is not the case, with the generated code loading and storing the local stack pointer.The
%int_as_pointerprimitive should be (and is) treated in the main part of flambda2 (everything beforeTo_cmm) a bit like a zero-sized stack allocation. The aim is to ensure that the primitive is not pushed past the end-region, just like any other local allocation. (At present I think this is really enforced by the fact thatEnd_regionis a hard barrier inTo_cmm, but with a more advanced effect analysis in that pass in the future, the region variable named by theInt_as_pointerprimitive will ensure this.)Not pushing
%int_of_pointerpast an end-region is actually a proxy for the real property we are trying to ensure, which is that the GC can never see the result of this primitive (as it is untagged and could cause misbehaviour, especially with a no-naked-pointers compiler or runtime5).The idea of this patch, then, is to delete "unused" regions during
To_cmm. The definition of "unused" here is "not needed to do any stack allocation", so in particular, a region only held onto viaInt_as_pointeris to be deleted. Now it seems like there might a risk of code motion transformations (now or in the future) in the backend needing the region in order to ensure soundness. Indeed, there is arguably a pre-existing problem here, becauseInt_as_pointeris expanded into Cmm code that does not name any region variable duringCmm_helpers. This patch aims to adopt a straightforward approach which I hope is going to work when tested on a large scale: make that Cmm code have return machtypeAddr. This in theory would allow the backend to move the primitive past anEnd_region, although I don't think there is currently any code in the compiler which would do this, but it does ensure (at compile time) the above property relating to the GC. This is the critical part.The patch itself is fairly easy. There are probably three PRs here to be split out:
Cmm_helpers.int_as_pointerreturn something of machtypeAddr.To_cmmtraverse the flambda2 terms in dominator order, so something straightforward can be done in point 3 below. I think this is probably a good change anyway. Some more thinking needs doing about the handling of loops (especially in conjunction with point 3. below), I just haven't had time yet.To_cmm_resultwhich regions have been seen to be used, and don't emit Cmm code forEnd_regionprimitives if the corresponding region is unused. It's a bit unfortunate that this seems to involve passing theSimple.tfor the argument to the unary primitive simplification function, but this does seem to be much better than trying to do this based on the Cmm variables (asIdent.t). I tried that latter way and it got complicated very quickly.I think the
%int_of_pointertest at local mode in the testsuite may not really be testing the right thing, but I will discuss with @riaqn .