Fixes to #8691 by jhjourdan · Pull Request #9027 · ocaml/ocaml

jhjourdan · 2019-10-09T11:28:51Z

Thanks to @gadmm, we found a bug in #8691 (cf. #8691 (comment)): async callbacks were not checked systematically in caml_alloc_small_dispatch, contrary to what we intended.

Fixing this will be especially important once #8805 is merged, since caml_alloc_small_dispatch will then be the only entry point for handling async callbacks in native mode. Without this PR but with #8805, signal handling in native mode would be delayed until the next minor collection.

In addition, this PR fixes two other less important issues in #8691:

- it gets rid of caml_memprof_to_do and caml_final_to_do, which were used in old versions of #8691 ("Guarantee that allocation functions do no trigger callbacks when called from C") and are now dead;
- it makes the async callback mechanism use young_limit in both native and bytecode modes. This makes it more uniform, and since we now check for async callbacks in caml_alloc_small_dispatch, there is no reason not to set young_limit when a callback is pending.
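
For illustration, the young_limit mechanism described above can be sketched as a toy model. This is a minimal sketch under stated assumptions, not the runtime's code: the names only mimic the roles discussed in this PR, and the index-based heap that grows downward, the counters and the function bodies are all hypothetical.

```c
#include <assert.h>

/* Toy model only: every name here is hypothetical and merely mimics
   the roles of the runtime variables discussed in this PR. */
enum { HEAP_WORDS = 1024 };

static long young_start = 0;          /* lowest usable index */
static long young_end = HEAP_WORDS;   /* one past the highest index */
static long young_ptr = HEAP_WORDS;   /* allocation proceeds downward */
static long young_limit = 0;          /* fast path fails below this */
static int something_to_do = 0;       /* an async callback is pending */
static int callbacks_run = 0;
static int minor_collections = 0;

/* Record a pending callback and raise the limit so that the very next
   allocation fails the fast-path check. */
static void set_action_pending(void) {
  something_to_do = 1;
  young_limit = young_end;
}

/* Slow path: collect if the heap is really full, and always check for
   pending callbacks, whether or not a collection happened. */
static void alloc_small_dispatch(long words) {
  if (young_ptr - words < young_start) {
    minor_collections++;              /* stand-in for a minor collection */
    young_ptr = young_end;
  }
  if (something_to_do) {
    something_to_do = 0;
    young_limit = young_start;        /* restore the normal limit */
    callbacks_run++;                  /* stand-in for running callbacks */
  }
  young_ptr -= words;
}

/* Fast path: a single comparison against young_limit. */
static long alloc_small(long words) {
  assert(words > 0 && words <= HEAP_WORDS);
  if (young_ptr - words < young_limit)
    alloc_small_dispatch(words);
  else
    young_ptr -= words;
  return young_ptr;
}
```

In this model, calling set_action_pending() and then alloc_small(2) runs the pending callback at that allocation without any minor collection, and later allocations go back to the fast path.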

jhjourdan · 2019-10-09T11:29:07Z

Cc @gadmm, @stedolan.

gadmm

Looks good to me. Using young_limit uniformly is nice, masking will rely on that. A small question again about memprof_track_young below.

You mention caml_final_to_do but I didn't see it appear in the patches.

This PR is needed for #8805 but technically not for 4.10 given that the current behaviour, although it is not the one desired, is not problematic. However, if we want to avoid delaying #8993, I would need to know if the current PR is going to be merged very very soon because I'll need to rebase on it. (Same remark for the reversion in semantics of check_urgent_gc which is due for 4.10 in all cases and that I'll need to do separately if #8993 was to be delayed.)

jhjourdan · 2019-10-09T13:21:45Z

> You mention caml_final_to_do but I didn't see it appear in the patches.

Ah right. I forgot to push that.

gadmm · 2019-10-09T13:24:14Z

> > You mention caml_final_to_do but I didn't see it appear in the patches.
>
> Ah right. I forgot to push that.

Looks good to me.

gadmm · 2019-10-09T14:44:12Z

A behaviour I noticed before but which is repeated with this patch: in long-running allocating C functions, since allocations do not reset caml_something_to_do, if a signal arrives or a finaliser is pending, then all subsequent small allocations are made slow because they will all poll and then decide to not do anything (until the callbacks are finally executed). Is this tradeoff accepted?

My implementation of masking tries hard to avoid that so maybe we can fix this in the future with masking.

stedolan · 2019-10-09T15:13:49Z

This patch looks good to me. Well spotted/fixed!

It worries me that there is no automated testing of this. I don't think this should be too hard to write a test case for: do something which causes a callback to become pending, and then allocate and assert that the callback runs before the next GC cycle. There's an example of such a test in testsuite/tests/c-api/alloc_async.

> A behaviour I noticed before but which is repeated with this patch: in long-running allocating C functions, since allocations do not reset caml_something_to_do, if a signal arrives or a finaliser is pending, then all subsequent small allocations are made slow because they will all poll and then decide to not do anything (until the callbacks are finally executed). Is this tradeoff accepted?
>
> My implementation of masking tries hard to avoid that so maybe we can fix this in the future with masking.

FWIW, I think this overhead is fine. If there's a simple way to avoid it, then that would be nice, but I don't think it's worth spending much effort or runtime complexity on.

jhjourdan · 2019-10-09T15:18:34Z

> A behaviour I noticed before but which is repeated with this patch: in long-running allocating C functions, since allocations do not reset caml_something_to_do, if a signal arrives or a finaliser is pending, then all subsequent small allocations are made slow because they will all poll and then decide to not do anything (until the callbacks are finally executed). Is this tradeoff accepted?

Well, yes, I thought that should be the price to pay. Do you have figures in order to evaluate the slowdown?

> My implementation of masking tries hard to avoid that so maybe we can fix this in the future with masking.

Still, we would need to enter caml_alloc_small_dispatch, which sometimes requires registering extra roots... That still has some costs. Again, figures would help. But that's an independent discussion.

jhjourdan · 2019-10-09T15:21:03Z

> It worries me that there is no automated testing of this. I don't think this should be too hard to write a test case for: do something which causes a callback to become pending, and then allocate and assert that the callback runs before the next GC cycle. There's an example of such a test in testsuite/tests/c-api/alloc_async.

Then, this test would only be relevant in native mode with #8805 because the bytecode interpreter checks caml_something_to_do directly. And in native mode, I am afraid that the test will fail on some platforms, because young_limit may be cached in a register...

stedolan · 2019-10-09T15:27:49Z

> Then, this test would only be relevant in native mode with #8805 because the bytecode interpreter checks caml_something_to_do directly

Good point. We should wait until #8805 is merged before adding this test.

> And in native mode, I am afraid that the test will fail on some platforms, because young_limit may be cached in a register...

This is true for finalisers but not for signals, so the test should use signals. When a signal arrives, the register caching young_limit is updated from the signal handler. I think this is the right behaviour, actually: it's OK to delay a finaliser until the next GC cycle since finalisers are always delayed by a GC cycle, but signals should be handled promptly.

> Still, we would need to enter caml_alloc_small_dispatch, which sometimes requires registering extra roots... That still has some costs. Again, figures would help. But that's an independent discussion.

(Agreed that this is an independent discussion, but if you're investigating this you can use CAMLdrop in the style of #8993 (comment) to avoid the overhead of registering roots in the fast case)

gasche · 2019-10-09T15:34:51Z

I'm all for merging this (@stedolan, I am assuming that you also approve of the patch?), but I think that indeed writing a test would be very nice if it's not too much work.

stedolan · 2019-10-09T15:43:22Z

> I'm all for merging this (@stedolan, I am assuming that you also approve of the patch?), but I think that indeed writing a test would be very nice if it's not too much work.

I agree!

> Then, this test would only be relevant in native mode with #8805 because the bytecode interpreter checks caml_something_to_do directly

Thinking about this again, isn't this only true for Memprof callbacks? Signal callbacks work fine before #8805, so a test using those in native code should fail before this PR and pass afterwards.

gadmm · 2019-10-09T16:04:05Z

> > Then, this test would only be relevant in native mode with #8805 because the bytecode interpreter checks caml_something_to_do directly
>
> Thinking about this again, isn't this only true for Memprof callbacks? Signal callbacks work fine before #8805, so a test using those in native code should fail before this PR and pass afterwards.

Currently (until #8805), in native, caml_garbage_collection correctly calls signal handlers both before and after this PR. IMO this is less of a bugfix than finishing implementing a feature that will only become visible once #8805 lands, in that sense a test isn't strictly necessary, and I agree with @jhjourdan if he says that it's complicated to test.

gadmm · 2019-10-09T16:26:18Z

> > A behaviour I noticed before but which is repeated with this patch: in long-running allocating C functions, since allocations do not reset caml_something_to_do, if a signal arrives or a finaliser is pending, then all subsequent small allocations are made slow because they will all poll and then decide to not do anything (until the callbacks are finally executed). Is this tradeoff accepted?
>
> Well, yes, I thought that should be the price to pay. Do you have figures in order to evaluate the slowdown?
>
> > My implementation of masking tries hard to avoid that so maybe we can fix this in the future with masking.
>
> Still, we would need to enter caml_alloc_small_dispatch, which sometimes requires registering extra roots... That still has some costs. Again, figures would help. But that's an independent discussion.

I agree that it's a separate discussion, and I do not have figures. It is very dependent on the user's code. If the user follows the advice of regularly calling a process_pending_* function, then they are safe from it. The trick with masking in order to avoid entering caml_alloc_small_dispatch is simply not to change young_limit when the mask is on. You remember that a callback is pending with caml_something_to_do, and you use it to set the young_limit when you turn off the mask. I think the same principle can apply for this problem, but we can discuss it again over at #8961 (once it is ready again for review after I rebase it).
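
The masking trick described above can be sketched in a toy model. This is a minimal sketch assuming a hypothetical mask_on flag and hypothetical function names; it is not the implementation proposed in #8961, only an illustration of the principle of leaving young_limit untouched while the mask is on and re-arming it when the mask is turned off.

```c
#include <assert.h>

/* Toy model only: all names are hypothetical stand-ins. */
enum { HEAP_WORDS = 1024 };

static long young_ptr = HEAP_WORDS;   /* allocation proceeds downward */
static long young_limit = 0;          /* fast path fails below this */
static int something_to_do = 0;       /* remembers a pending callback */
static int mask_on = 0;               /* callbacks are masked when set */
static int slow_path_entries = 0;
static int callbacks_run = 0;

/* A callback becomes pending. While masked, young_limit is left alone,
   so allocations keep using the fast path. */
static void notify_pending(void) {
  something_to_do = 1;
  if (!mask_on)
    young_limit = HEAP_WORDS;
}

/* Turning the mask off re-arms young_limit if something is pending. */
static void set_mask(int on) {
  mask_on = on;
  if (!on && something_to_do)
    young_limit = HEAP_WORDS;
}

static void alloc_words(long words) {
  assert(words > 0 && words <= HEAP_WORDS);
  if (young_ptr - words < young_limit) {
    slow_path_entries++;
    if (young_ptr - words < 0)
      young_ptr = HEAP_WORDS;         /* stand-in for a minor collection */
    if (something_to_do && !mask_on) {
      something_to_do = 0;
      young_limit = 0;
      callbacks_run++;                /* stand-in for running callbacks */
    }
  }
  young_ptr -= words;
}

/* Helper for demonstrations: allocate n blocks of the given size. */
static void alloc_many(int n, long words) {
  for (int i = 0; i < n; i++)
    alloc_words(words);
}
```

With the mask on, a pending callback does not push any allocation onto the slow path in this model; the first allocation after set_mask(0) enters the slow path once and runs the callback.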

jhjourdan · 2019-10-10T08:24:41Z

> Currently (until #8805), in native, caml_garbage_collection correctly calls signal handlers both before and after this PR. IMO this is less of a bugfix than finishing implementing a feature that will only become visible once #8805 lands, in that sense a test isn't strictly necessary, and I agree with @jhjourdan if he says that it's complicated to test.

Actually, I think we can write a test, which would fail before this PR in bytecode. Just give me a little more time for that!

jhjourdan · 2019-10-11T09:22:25Z

I have just added a test which checks that signals are indeed handled at every allocation in the minor heap from OCaml code. I took care not to call any function after raising the signal, so that we are sure that the signal is not handled by the other mechanism in interp.c. The test indeed fails without the current commit.
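
The shape of such a check can be sketched in a self-contained toy model. This is not the OCaml test added to the testsuite: it is a hypothetical C sketch in which a real signal handler (installed with the standard signal and raise functions) trips a toy young_limit, and the check is that the handler's work runs at the very next allocation, before any simulated minor collection.

```c
#include <assert.h>
#include <signal.h>

/* Toy model only: names are hypothetical and merely mimic the roles
   of the runtime variables discussed in this thread. */
enum { HEAP_WORDS = 100 };

static volatile sig_atomic_t signal_pending = 0;
static volatile sig_atomic_t young_limit = 0;  /* written by the handler */
static long young_ptr = HEAP_WORDS;            /* allocation goes downward */
static int handlers_run = 0;
static int minor_collections = 0;

/* Async-signal-safe: only stores to volatile sig_atomic_t objects. */
static void record_signal(int signo) {
  (void)signo;
  signal_pending = 1;
  young_limit = HEAP_WORDS;   /* force the next allocation to the slow path */
}

static void alloc_words(long words) {
  assert(words > 0 && words <= HEAP_WORDS);
  if (young_ptr - words < young_limit) {       /* slow path */
    if (young_ptr - words < 0) {
      minor_collections++;                     /* stand-in for a collection */
      young_ptr = HEAP_WORDS;
    }
    if (signal_pending) {
      /* A real runtime must cope with a signal arriving right here;
         this single-threaded toy ignores that race. */
      signal_pending = 0;
      young_limit = 0;
      handlers_run++;                          /* stand-in for the OCaml handler */
    }
  }
  young_ptr -= words;
}

/* Returns 1 if the handler ran at the first allocation after the
   signal and before any minor collection, 0 otherwise. */
static int signal_handled_at_next_alloc(void) {
  int handlers_before = handlers_run;
  int collections_before = minor_collections;
  young_ptr = HEAP_WORDS;                      /* start from an empty heap */
  if (signal(SIGINT, record_signal) == SIG_ERR)
    return 0;
  alloc_words(2);
  if (handlers_run != handlers_before)
    return 0;                                  /* nothing was pending yet */
  if (raise(SIGINT) != 0)
    return 0;
  alloc_words(2);
  return handlers_run == handlers_before + 1
      && minor_collections == collections_before;
}
```

The check deliberately does nothing between raise and the allocation, echoing the care taken above to rule out any other polling point.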

gadmm · 2019-10-11T13:02:28Z

Thanks!

gadmm approved these changes Oct 9, 2019

jhjourdan force-pushed the fix_8691 branch from 119edb3 to f009e29 Compare October 9, 2019 13:21

jhjourdan force-pushed the fix_8691 branch from f009e29 to f49a733 Compare October 9, 2019 13:31

stedolan approved these changes Oct 9, 2019

jhjourdan force-pushed the fix_8691 branch 2 times, most recently from 44553d0 to f2f36c9 Compare October 11, 2019 09:15

jhjourdan added 5 commits October 11, 2019 11:17

Remove dead variables [caml_memprof_to_do] and [caml_final_to_do].

1e31beb

Make sure async callbacks and urgent GC requests are called for every call to [caml_alloc_small_dispatch].

6027c9e

Now that we always check for async callbacks in [caml_alloc_small_dispatch], we can use [young_limit] to check for signals both in bytecode and native modes.

71d302a

Add test for ocaml#9027.

edb358c

Update Changes.

7db098b

jhjourdan force-pushed the fix_8691 branch from f2f36c9 to 7db098b Compare October 11, 2019 09:19

gasche merged commit 72869a3 into ocaml:trunk Oct 11, 2019

stedolan mentioned this pull request Jun 29, 2020

EINTR-based signals, again #9722

Merged

stedolan mentioned this pull request Jul 27, 2020

Ensure signals are handled before Unix.kill returns #9802

Merged

anmolsahoo25 pushed a commit to anmolsahoo25/ocaml that referenced this pull request Aug 25, 2020

Disable signals_alloc test until we implement ocaml#9027

f57caae

sadiqj pushed a commit to sadiqj/ocaml that referenced this pull request Jan 10, 2022

Disable signals_alloc test until we implement ocaml#9027

f16bc4c

gasche mentioned this pull request Jan 19, 2023

statmemprof is absent in OCaml 5.0 #11911

Closed
