Trigger a minor GC when custom blocks accumulate in minor heap #1476
damiendoligez merged 6 commits into trunk from
Conversation
When a custom block is allocated with caml_alloc_custom, one specifies a ratio (mem/max) which is used to increase the pressure on the major GC. But before this commit, there was no logic to put pressure on the minor GC. Consequently, a trivial loop that allocates rather large bigarrays and throws them away immediately will trigger out-of-memory on systems with constrained memory, in particular on 32-bit systems. Before OCaml 4.03, custom objects with finalizers (which include bigarrays) were never allocated in the minor heap, which reduced the risk of hitting the problem. This commit replicates the logic used to trigger the major GC: when the sum of the ratios of the custom blocks allocated in the minor heap exceeds 1, a minor collection is forced.
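The heuristic can be sketched as a standalone simulation. The names below (`extra_minor`, `note_custom_alloc`) are illustrative only, not the runtime's actual symbols, and the reset stands in for what a real minor collection would do:

```c
/* Fraction of out-of-heap resources held by custom blocks in the
   minor heap; a minor collection empties the minor heap, so the
   counter is reset when a collection is forced. */
static double extra_minor = 0.0;

/* Called on each custom-block allocation with the (mem, max) pair
   given to caml_alloc_custom; returns 1 if a minor GC is forced. */
static int note_custom_alloc(unsigned long mem, unsigned long max)
{
  if (mem == 0) return 0;      /* no extra resources to track */
  if (max == 0) max = 1;       /* avoid division by zero */
  extra_minor += (double) mem / (double) max;
  if (extra_minor > 1.0) {
    extra_minor = 0.0;         /* stand-in for the forced minor collection */
    return 1;
  }
  return 0;
}
```

With ratio 1/5 per allocation, the sixth allocation pushes the sum past 1.0 and triggers a collection.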
byterun/custom.c
Outdated
add_to_custom_table (&caml_custom_table, result, mem, max);
/* Keep track of extra resources held by custom block in
   minor heap. */
caml_extra_heap_resources_minor += (double) mem / (double) max;
I am not knowledgeable enough to opine on the proposed strategy, but shouldn't we protect against invalid values of mem and max? (e.g. max == 0)
I've added some protection against mem = 0 (just an optimization) and max = 0, setting it to 1 instead, as caml_adjust_gc_speed does. That function also truncates mem to be <= max, but this seems useless (strictly speaking, it would matter only if the comparison against the threshold were >= 1. instead of > 1.). Note that both mem and max are unsigned integers, so there is no need to deal with negative values.
max = 0, setting it to 1
Actually, this might be useless as well (just let the ratio be +infinity, which would trigger the minor GC immediately; this is also the case with max = 1, mem > 0 anyway).
FTR, @lpw25 suggested taking the counter for custom blocks in the major heap into account as well when triggering the minor GC (i.e. considering the sum of ratios for custom blocks in the minor and major heaps). The change in the code is trivial. I'll let @damiendoligez comment on the preferred heuristic.

I don't think we should take the major-heap counter into account, since it won't be reset by a minor collection. If we do, then when that counter is almost full we'll do more minor collections than necessary.

@damiendoligez Apart from this point, do you have an opinion on the overall proposal?
damiendoligez left a comment
The change looks good in principle, but you are using a deprecated function to trigger the minor GC.
byterun/caml/minor_gc.h
Outdated
asize_t, asize_t);
extern void caml_oldify_one (value, value *);
extern void caml_oldify_mopup (void);
extern void caml_minor_collection (void);
You shouldn't export this function; it's deprecated.
byterun/custom.c
Outdated
if (max == 0) max = 1;
caml_extra_heap_resources_minor += (double) mem / (double) max;
if (caml_extra_heap_resources_minor > 1.0) {
  caml_minor_collection();
You should call
caml_request_minor_gc ();
caml_gc_dispatch ();
as done in array.c to trigger a minor collection.
Whilst just taking the major counter into account causes this bad behaviour, it still seems wrong to me that you can now use up twice the user-specified limit before trying to collect. Is there nothing we can do to try to obey the user's limit? The pathological case is when 99% of the limit is used up on the major heap and 1% on the minor heap, which, if we take both limits into account, will cause us to do a minor collection for every new 1% of the limit allocated. What is the behaviour we actually want in this case? Clearly, doing a major collection could free up more of the limit and allow for longer between minor collections, but how do we make that trade-off? One possibility might be to try to keep the ratio between the major limit and the minor limit proportional to the relative sizes of the major and minor heaps.
(I have not been involved in this discussion previously because I didn't understand some basic things, one of them corresponding to Damien's remark above: I didn't understand why people wanted to involve the major heap limit in deciding to make minor collections.)
This would make the minor limit very small in practice in most cases, meaning that it would be hard to allocate custom blocks without doing a lot more minor collections.
Considering that the current situation is much worse than that (since 4.03 at least) without producing many complaints, I'm tempted to believe that this factor of 2 is OK.
Some design space:
It sounds much simpler to have separate minor and major custom limits and let users choose both, as a more fine-grained alternative to picking only a total limit. We could pick a default ratio between the two (say, 1/5) and by default set the two so that they sum to the current total limit.
Refined proposal: we could have a "global limit", and then for each heap a "soft limit". We only force a collection when the total custom space reaches the global limit, but in that case we collect all the heaps that have reached their soft limit. This way,
A default choice for the minor and major soft limits still needs to be decided.
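For illustration, the "one hard limit, two soft limits" policy could be sketched as a pure decision function. All names and the 1/3 defaults below are hypothetical; nothing like this exists in the runtime:

```c
/* Hypothetical limits: one global hard limit plus a per-heap soft limit. */
static const double hard_limit = 1.0;
static const double soft_minor = 1.0 / 3.0;
static const double soft_major = 1.0 / 3.0;

/* Given current custom-resource usage per heap, decide which heaps to
   collect: bit 1 = minor, bit 2 = major.  Collection is only forced once
   the *total* usage reaches the hard limit; then every heap that has
   reached its own soft limit is collected. */
static int heaps_to_collect(double minor_usage, double major_usage)
{
  int res = 0;
  if (minor_usage + major_usage < hard_limit) return 0;
  if (minor_usage >= soft_minor) res |= 1;
  if (major_usage >= soft_major) res |= 2;
  return res;
}
```

A nonzero result only says which heaps have exceeded their soft limits once the hard limit is hit; how that decision would map onto the incremental major GC is left open.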
I'm surprised by the absence of an enthusiastic response to my "one hard limit, two soft limits" proposal, which still sounds excellent to me :-) @alainfrisch, @damiendoligez, @lpw25, are you less convinced? Should I try to propose a PR?
Honestly, I have no strong opinion. It might be better than the proposed approach, but it's also more complex and it changes the public API. The current state of the PR is, I believe, a strict improvement over the current situation: it fixes issues observed in the wild, and it has been more or less blessed by @damiendoligez. I think we should merge, and if time permits we can look for something even better in another PR before the next release.
I'm fine with that.
That sounds like a good approach to me.
@gasche Are you also ok with the incremental approach? @damiendoligez I've addressed the "requested change" (not using a deprecated function). Can you review again?
I certainly agree that consuming, at most, twice the desired size (the disappointing aspect of the proposed behavior) is better than OOM-ing. Note: the soft limit does not need to change the public API; for example, we can set 1/3 for both soft-limit ratios and we get something essentially similar to your proposal (except that workloads that allocate on both the minor and major heaps collect more often to stay below 1.0 utilization, instead of reaching 2.0 utilization in the worst case).
Gentle ping @damiendoligez |
  caml_request_minor_gc ();
  caml_gc_dispatch ();
  }
}
I think it would be nice to encapsulate this logic in a separate function like caml_adjust_gc_speed (caml_adjust_minor_gc_speed?). Also, comparison with the major-heap function suggests that including CAML_INSTR_* instrumentation code could be useful.
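Such a helper might look like the following sketch, mirroring the shape of caml_adjust_gc_speed. The name caml_adjust_minor_gc_speed is only the reviewer's suggestion, and the two runtime entry points are replaced by stubs so the sketch is self-contained:

```c
static double caml_extra_heap_resources_minor = 0.0;
static int dispatched = 0;

/* Stubs standing in for the real runtime entry points. */
static void caml_request_minor_gc (void) { }
static void caml_gc_dispatch (void)
{
  dispatched++;
  caml_extra_heap_resources_minor = 0.0;  /* minor heap emptied */
}

/* Suggested helper: accumulate the mem/max ratio and force a minor
   collection once the accumulated ratio exceeds 1. */
static void caml_adjust_minor_gc_speed (unsigned long mem, unsigned long max)
{
  if (mem == 0) return;
  if (max == 0) max = 1;
  caml_extra_heap_resources_minor += (double) mem / (double) max;
  if (caml_extra_heap_resources_minor > 1.0) {
    caml_request_minor_gc ();
    caml_gc_dispatch ();
  }
}
```

Two allocations with ratio 3/4 accumulate to 1.5, crossing the threshold on the second call.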
You can't decide to collect the major heap, because the major GC is incremental: the major heap is being collected continuously. You need to decide how much work the major GC will do for each word allocated by the program. I'm not sure how that would work with your soft-limit idea.
Attempt to address MPR#7100.
This also fixes MPR#7671, although this is still mysterious to me (since the repro case for that one forced a GC compaction before the final allocation that triggered the OOM).