|
This should also hopefully address MPR#7100. |
|
Not strictly speaking related to the proposed change, but I think it's good to maintain an overall understanding of how custom blocks are handled by the GC. Could you explain the following heuristic in caml_adjust_gc_speed:

    if (caml_extra_heap_resources
        > (double) caml_minor_heap_wsz / 2.0
          / (double) caml_stat_heap_wsz) {
      CAML_INSTR_INT ("request_major/adjust_gc_speed_2@", 1);
      caml_request_major_slice ();
    }

As the heap grows during the life of the program, this will trigger more and more often. For instance, each i/o channel allocation (currently with ratio 1/1000) will trigger a major slice as soon as: and this can easily happen. Does it really make sense to trigger GC slices more often when the heap grows for a constant allocation rate of custom blocks? |
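To make the concern above concrete, here is a minimal, self-contained sketch of the arithmetic in the quoted condition. It is not runtime code: the helper names (slice_threshold, allocs_until_slice) are hypothetical, and it assumes that each custom-block allocation simply adds its ratio to an accumulator that starts at zero and is not reset in between.

```c
#include <assert.h>
#include <stddef.h>

/* Threshold from the quoted condition: a major slice is requested once
   the accumulated ratio exceeds minor_heap_wsz / 2 / stat_heap_wsz.
   Both sizes are in words, as in the quoted code. */
static double slice_threshold(size_t minor_heap_wsz, size_t stat_heap_wsz)
{
  assert(stat_heap_wsz > 0);
  return (double) minor_heap_wsz / 2.0 / (double) stat_heap_wsz;
}

/* Number of custom-block allocations, each contributing `ratio`
   (e.g. 1/1000 for an i/o channel), before the condition fires.
   Assumption: the accumulator starts at 0 and nothing resets it. */
static unsigned long allocs_until_slice(double ratio,
                                        size_t minor_heap_wsz,
                                        size_t stat_heap_wsz)
{
  assert(ratio > 0.0);
  double threshold = slice_threshold(minor_heap_wsz, stat_heap_wsz);
  double extra = 0.0;
  unsigned long n = 0;
  do {
    extra += ratio;   /* one more custom block allocated */
    n++;
  } while (extra <= threshold);
  return n;
}
```

With a hypothetical minor heap of 262144 words, a major heap 10 times larger needs about 50 channel allocations before a slice is requested, while a major heap 1000 times larger requests a slice on every single allocation. The threshold only shrinks as the major heap grows.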
|
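A bit of digging with git blame reveals that this heuristic was already present in 1995 in the very first commit in OCaml's history, so it dates back to Caml Light... It's probably time to re-assess because I really don't remember the reason for this. |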
|
On Thu, Apr 26, 2018 at 4:10 PM Damien Doligez ***@***.***> wrote:
> A bit of digging with git blame reveals that this heuristic was already
> present in 1995 in the very first commit in OCaml's history, so it dates
> back to Caml Light... It's probably time to re-assess because I really
> don't remember the reason for this.
If you're in the mood for archeology, the Caml Light sources (with CVS
history) are here: https://github.com/camllight/camllight
Enjoy :-)
- Xavier
|
|
This looks really nice. The only bit that worries me is |
This leads to: camllight/camllight@67c1dc3#diff-d0f533866c552cc04626e6d892579278R153 which does not give any rationale for that specific heuristic. |
Sounds like a good idea. I'll try it. |
(branch force-pushed from 5ad3b36 to 6bab982)
|
@stedolan Done. |
|
@damiendoligez Should this new API also be used for i/o channels? |
|
@alainfrisch: I/O buffers consume both one file descriptor (a system resource, perhaps more appropriate for the old API?) and an out-of-heap buffer (a memory resource, very appropriate for the new API). So, it's a tie. This said I'm very happy to see the new API. If the experts are OK with it I suggest merging in trunk ASAP. |
I don't think that's the case. The lifetime of the file descriptor does not depend on the GC: it can and must be released explicitly (with close), and the GC finaliser won't touch it (no implicit close). So as far as I can tell, the channel object is equivalent to a pure memory buffer, as far as the GC is concerned. |
|
(About i/o channels: with an extra indirection, one could even destroy (free) the buffer when the channel is (explicitly) closed, but it's tricky to make this thread-safe.) |
|
@alainfrisch you're right, the FD part of the channel is unmanaged, and the memory part is a good fit for the new API, so I'm going to convert it too. Don't merge before I do that. |
(branch force-pushed from 4ab5946 to d278ab1)
|
Why just not use Here is a small test case: (it requires 4.07) |
That API is harder to use (it mandates the use of finalizers) and never caught on. It doesn't let the user control the ratio of heap to external memory. |
(branch force-pushed from d278ab1 to 2dd846b)
|
As far as I can tell, this PR has not been fully reviewed (and not approved), and @damiendoligez indicates that it is ready. @stedolan, would you maybe be willing to do a review? |
(branch force-pushed from 2dd846b to ac908e4)
|
( @alainfrisch or @jhjourdan might be qualified to review as well. ) |
alainfrisch
left a comment
LGTM, only cosmetic comments.
@damiendoligez Did you test MPR#7100 against this branch?
|
You could also use the new API in win32graph/draw.c, graph/image.c and avoid magic constants in those files. It seems all other uses in the core distribution are with default values for mem/max (0/1), so could also be switched to new API (or not). |
|
I'm not sure about Win32, but on X Window, the memory held by a Pixmap is in the server, not in the user process, so I don't think it makes sense to use the new API there. For the other uses, you're right that they all use 0/1 so converting them wouldn't change anything. |
|
I tried the test program of MPR#7100 and indeed with this PR it doesn't fail. |
|
Note: it will be time to revisit #1961 after this PR is merged and tested in the wild for some time. |
(Actually, I'm confused on whether #1476 already fixed that one.) |
I don't think it did. I tried the test program with 4.07.1 and it failed. |
|
CI passed, so I'm merging. Thanks for the review! |
|
Other tickets more or less directly related to this PR: https://caml.inria.fr/mantis/view.php?id=7198 (I'm not sure it's worth mentioning them in the Changelog, but I'll let @damiendoligez decide.) |
|
This PR resolves MPR#7198 but the others will also need changes in other software. I'll add it to Changes. |
|
I added a mention of MPR#7198 (3f46a1d). The others are not directly fixed by this PR. |
This PR is related to #1476 and MPR#7750.
It adds one function to the runtime and three new GC parameters.
The function (caml_alloc_custom_mem) is to be used by C code to allocate custom objects that consist of a small block in the heap pointing to a large amount of memory outside the heap (typically allocated with malloc). This function takes only one size parameter, the number of bytes of memory outside the heap used by this custom block.

The user can then control how much of this memory is retained by dead blocks because the GC does not immediately collect dead blocks ("floating garbage"). This is done with three new GC parameters:

M, custom_major_ratio: This is a percentage relative to the major heap size. A complete GC cycle will be done every time 2/3 of that much memory is allocated for blocks in the major heap. Assuming constant allocation and deallocation rates, this means there are at most (M/100 * major-heap-size) bytes of floating garbage at any time. The factor of 2/3 (or 1.5) is (roughly speaking) because the major GC takes 1.5 cycles (previous cycle + marking phase) before it starts to deallocate dead blocks allocated during the previous cycle.

m, custom_minor_ratio: This is a percentage relative to the minor heap size. A minor GC will be forced when the memory allocated for blocks in the minor heap reaches this value.

n, custom_minor_max_size: This is a number of bytes. Blocks allocated in the minor heap that are over this size are assumed to be long-lived and counted immediately against the major heap rather than the minor heap. This is to avoid large allocations forcing a minor GC every time.

The function is then used to allocate bigarrays, allowing us to get rid of the infamous CAML_BA_MAX_MEMORY constant.

When applied to the test program of MPR#7750, the results are pretty good: we can make the process size vary from about 6GB (with M=10) to 15 GB (with M=100) to 25 GB (with M=200).
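The relation between M, the floating-garbage bound, and the 2/3 factor can be checked with a few lines of arithmetic. The sketch below only restates the formulas from the description; the function names are hypothetical and this is not how the runtime computes its slice sizes.

```c
#include <assert.h>

/* Upper bound on floating garbage, in bytes, as stated in the
   description: M/100 * major-heap-size. */
static double floating_garbage_bound(double major_heap_bytes,
                                     unsigned custom_major_ratio)
{
  assert(major_heap_bytes >= 0.0);
  return major_heap_bytes * (double) custom_major_ratio / 100.0;
}

/* Out-of-heap bytes that may be allocated before one complete GC
   cycle is done: 2/3 of the bound above. Since dead blocks are only
   freed about 1.5 cycles after their allocation, 1.5 * (2/3) of the
   bound gives back the bound itself. */
static double bytes_per_major_cycle(double major_heap_bytes,
                                    unsigned custom_major_ratio)
{
  return floating_garbage_bound(major_heap_bytes, custom_major_ratio)
         * 2.0 / 3.0;
}
```

For example, with a 300-byte major heap and M=100, the bound is 300 bytes and a complete cycle is done for every 200 bytes of out-of-heap memory allocated.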
needs to be done: documentation
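The minor-heap parameters m and n from the description can be modelled the same way. This is a toy model that follows the wording of the description only: the struct, its fields, and account_custom_alloc are hypothetical names, and resetting the minor counter after a forced minor GC is an assumption, not something the description states.

```c
#include <assert.h>
#include <stddef.h>

/* Toy accounting state (hypothetical, not a runtime structure). */
typedef struct {
  double minor_heap_bytes;       /* size of the minor heap */
  unsigned custom_minor_ratio;   /* m, a percentage */
  size_t custom_minor_max_size;  /* n, in bytes */
  double minor_counted;          /* bytes charged to the minor heap */
  double major_counted;          /* bytes charged to the major heap */
} custom_accounting;

/* Charge `mem` out-of-heap bytes for a custom block allocated in the
   minor heap. Returns 1 when a minor GC should be forced. */
static int account_custom_alloc(custom_accounting *a, size_t mem)
{
  assert(a != NULL);
  if (mem > a->custom_minor_max_size) {
    /* Over n bytes: assumed long-lived, counted against the major heap. */
    a->major_counted += (double) mem;
    return 0;
  }
  a->minor_counted += (double) mem;
  double limit = a->minor_heap_bytes * (double) a->custom_minor_ratio / 100.0;
  if (a->minor_counted >= limit) {
    a->minor_counted = 0.0;  /* assumption: counter restarts after the GC */
    return 1;
  }
  return 0;
}
```

With a 1000-byte minor heap, m=50 and n=100, a 200-byte block is charged to the major heap without forcing anything, while the fifth 100-byte block reaches the 500-byte limit and forces a minor GC.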