Skip to content

Implement quality treatment for asynchronous actions in multicore (1/N)#11095

Merged
sadiqj merged 6 commits intoocaml:trunkfrom
gadmm:multicore_async_actions_1
Apr 13, 2022
Merged

Implement quality treatment for asynchronous actions in multicore (1/N)#11095
sadiqj merged 6 commits intoocaml:trunkfrom
gadmm:multicore_async_actions_1

Conversation

@gadmm
Copy link
Copy Markdown
Contributor

@gadmm gadmm commented Mar 7, 2022

Here is a first batch of commits from #11057 following @sadiqj's review.

No change entry needed.

@gadmm
Copy link
Copy Markdown
Contributor Author

gadmm commented Mar 7, 2022

Some tests were failing for probably non-trivial reasons, so I had to cut this PR to the first 4 commits. The rest will need to be included with some of the subsequent commits in a separate PR.

@gadmm
Copy link
Copy Markdown
Contributor Author

gadmm commented Mar 7, 2022

After trying to have only the first 4 commits, there are still tests failing concerning the polling mechanism in the intermediate state. So we will have to consider a larger set of commits after all. I have now included commits up to cfbbf6f8, i.e. the first 14 commits of #11057.

@gadmm
Copy link
Copy Markdown
Contributor Author

gadmm commented Mar 10, 2022

The poll.ml test might indicate a bug which I have to look at the next time I have some time.

@gadmm gadmm force-pushed the multicore_async_actions_1 branch from c9bbd34 to de50733 Compare March 12, 2022 16:05
@gadmm
Copy link
Copy Markdown
Contributor Author

gadmm commented Mar 12, 2022

Re poll.ml. There was an off-by-one error when setting young_limit to young_end. What happens is that the comparison young_ptr < young_end can fail if young_ptr == young_end (e.g. polling point simulating an allocation of size 0). I wonder if this issue is already in OCaml 4 after addition of the new polling points.

I have strengthened the test so that it showed the failure more often.

@gadmm gadmm force-pushed the multicore_async_actions_1 branch from de50733 to ee2e3d6 Compare March 12, 2022 16:18
@gadmm
Copy link
Copy Markdown
Contributor Author

gadmm commented Mar 12, 2022

@sadiqj I have now fixed the bug in this PR and the other one and expect green CI (except for the unneeded change entry). Let's try to get a small set in, you have essentially already reviewed this one.

achieves the proper synchronisation. */
atomic_exchange(&dom_st->young_limit, (uintnat)dom_st->young_start);
if (caml_incoming_interrupts_queued()
|| atomic_load_explicit(&dom_st->requested_external_interrupt,
Copy link
Copy Markdown
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

I think this needs caml_check_pending_signals. Aside from allowing for asynchronous actions follow-ups, the value from this reorganisation should be that signals don't get delayed if they arrive at an inopportune time.

Copy link
Copy Markdown
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

If I understand correctly, caml_update_young_limit as implemented here will keep young_limit in interrupt-triggering state if a STW request is still pending, or a "requested external interrupt". On the other hand, signals and explicit GC requests are not checked -- even if some are pending, young_limit will be reset to young_start. I'm not sure I understand the reasoning for behaving differently on these different types of interruptions -- a documentation comment would help.

@sadiqj
Copy link
Copy Markdown
Contributor

sadiqj commented Mar 13, 2022

Thanks @gadmm . Unless I'm missing something ee2e3d6 isn't necessary here.

@gadmm
Copy link
Copy Markdown
Contributor Author

gadmm commented Mar 14, 2022

It is up to you. I can add more commits from the full PR. I do not think this PR introduces bugs and I can quickly follow up with a second PR with some of the commits that you wanted we go slower with.

@sadiqj
Copy link
Copy Markdown
Contributor

sadiqj commented Mar 18, 2022

It is up to you. I can add more commits from the full PR. I do not think this PR introduces bugs and I can quickly follow up with a second PR with some of the commits that you wanted we go slower with.

Can we add the caml_check_pending_signals check in caml_reset_young_limit and drop ee2e3d6 ? At that point I'm happy to approve and we can move on to the other changes in a separate PR.

@gadmm
Copy link
Copy Markdown
Contributor Author

gadmm commented Mar 18, 2022

Notice that the check for caml_check_pending_signals requires the two preceding commits from the other PR (and thus incidentally also ee2e3d6).

I'll go and drop ee2e3d6.

@gadmm gadmm force-pushed the multicore_async_actions_1 branch from ee2e3d6 to 0e75143 Compare March 18, 2022 14:37
@sadiqj
Copy link
Copy Markdown
Contributor

sadiqj commented Apr 3, 2022

Notice that the check for caml_check_pending_signals requires the two preceding commits from the other PR (and thus incidentally also ee2e3d6).

Apologies, I missed this reply. I may be misunderstanding but I'm not sure why the preceding commits are necessary for the caml_check_pending_signals check in caml_reset_young_limit here:

ocaml/runtime/domain.c

Lines 1222 to 1224 in 0e75143

if (caml_incoming_interrupts_queued()
|| atomic_load_explicit(&dom_st->requested_external_interrupt,
memory_order_relaxed)) {

This would avoid the situation we have currently where I think it's possible to race a signal, overwrite young limit and delay its processing until the next GC interrupt.

@gadmm
Copy link
Copy Markdown
Contributor Author

gadmm commented Apr 3, 2022

At this stage this would cause a loop during GC. The preceding commits change the loop, making it possible to progress even if signals are not treated immediately (see their commit log).

@sadiqj
Copy link
Copy Markdown
Contributor

sadiqj commented Apr 3, 2022

At this stage this would cause a loop during GC. The preceding commits change the loop, making it possible to progress even if signals are not treated immediately (see their commit log).

Got it, thanks. The loop in caml_garbage_collection means we'll continually retry. I'm happy with 3ef40cd and bd1b202 , they essentially restore the existing logic from 4.x .

With those two additional commits my understanding is that this PR would get us back to not raising exceptions when we enter the GC from C. Is that correct?

@gadmm
Copy link
Copy Markdown
Contributor Author

gadmm commented Apr 3, 2022

Unfortunately not. This is the commit Delay finalisers and reimplement [caml_process_pending_actions_exn]. I can quickly follow-up with a PR that contains it (along with the signal change).

@sadiqj
Copy link
Copy Markdown
Contributor

sadiqj commented Apr 7, 2022

5e5fa00 looks good too, should we pull that in as well? It'd be nice to tie up the async exception safe work in one PR.

@gadmm
Copy link
Copy Markdown
Contributor Author

gadmm commented Apr 7, 2022

No, this will require earlier commits. I do not think that we are going to change the order of the commits, it works as it is and it is not easy to do.

@gadmm gadmm force-pushed the multicore_async_actions_1 branch from 0e75143 to df7958b Compare April 7, 2022 15:08
@gadmm
Copy link
Copy Markdown
Contributor Author

gadmm commented Apr 7, 2022

Just rebased.

@gasche
Copy link
Copy Markdown
Member

gasche commented Apr 9, 2022

I'm trying to have a look at this PR. (I wanted to see if my recent foraging into runtime code made me more able to review this sort of work; it looks like I was overly optimistic about that). The first two commits are fine but the third commit is already quite technical :-)

So, about commit d365a92 : caml_garbage_collection ends up with a loop that handles callbacks (what would be done by caml_alloc_small_dispatch in 4.14), two questions about this:

  1. the loop (unchanged in your commit) contains a check of the form young_ptr - whsize <= young_limit, why is it not < young_limit here, as in the rest of the runtime?
  2. why is there not a fence inside the loop? I thought the idea was to push the fence immediately after the (successful) young_limit checks

runtime/domain.c Outdated
atomic_store_rel(&domain_state->young_limit,
(uintnat) domain_state->young_start);
domain_state->young_ptr = domain_state->young_end;
caml_update_young_limit();
Copy link
Copy Markdown
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

I'm not sure about this change, it is now running some pretty involved logic at a point where we know we cannot be interrupted, and we are just initializing the minor heap state.

Copy link
Copy Markdown
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

The principle of caml_update_young_limit is that you can call it whenever you want since it always sets the young_limit to its necessary value. Actually you must use it to do any modification to the young_limit, because otherwise you can race with an interrupt and miss it (later on with signals).
This has nothing to do with triggering an interrupt, which would be a bigger problem if there was a risk of it happening in this part of code.

Copy link
Copy Markdown
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Well caml_update_young_limit probably assumes something about the state of the rest of the runtime (that various bits have been initialized correctly), and it's not obvious that these assumptions hold in this code that is in the process of setting up the minor GC. For example, in the current runtime code the code path here is called before the following later initialization code:

  domain_state->requested_major_slice = 0;
  domain_state->requested_minor_gc = 0;
  domain_state->requested_external_interrupt = 0;

Are we sure that your code can run with those fields uninitialized? (Or possibly reused from a previous domain, etc.)

On the other hand, the code we are discussing allocate_minor_heap can also be called mid-program (when resizing the minor heaps), so the problem of a signal coming at this time should indeed be considered. Still, I think that the more correct approach is to delay this caml_update_young_limit call until the domain is in a well-defined state, and set it more simply (possibly using a dedicated caml_init_young_limit function if you want to keep visibility over all writes, and possibly with NULL, etc., your choice) in the meantime.

Copy link
Copy Markdown
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Are we sure that your code can run with those fields uninitialized? (Or possibly reused from a previous domain, etc.)

You probably can (it is always fine to force an allocation failure spuriously). However certainly unintended (and UB) to rely on uninitialised memory. I'll have a look.

Copy link
Copy Markdown
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

The domain state is allocated with calloc, so memory is initialized in every case. Therefore I keep the call to caml_reset_young_limit to serve the other code paths.

Upon recycling some domain state, the interrupt variables are reinitialized later than the current call to caml_reset_young_limit. This is harmless, however it is cleaner to add an additional call at the end of domain initialization, as in the new commit.

@gasche
Copy link
Copy Markdown
Member

gasche commented Apr 9, 2022

Looking at the redefinition of caml_update_young_limit requires reading some pretty tricky synchronization code that is not clearly documented anywhere in the codebase (unless I missed something?). Here is what I understand so far -- this is mostly about understanding what the current code in trunk does.

First, "interrupt" can mean one of three distinct things (in a subset relation).

  1. STW interrupts, which correspond exactly to STW requests coming from another domain. This is the notion of "interrupt" in caml_send_interrupt (used only for STW interrupts) and interruptor.interrupt_pending (which is true exactly when a STW interrupt is pending)
  2. service interrupts (I just made up the name), a broader class of interrupts including STW interrupts, but also signals, explicit GC requests from the user, "requested external interrupts" coming from threads, and later memprof probes. This is the notion of "interrupt" used in interrupt_domain and caml_interrupt_self (which is called by all service interrupts that are not STW interrupts; in particular, STW interrupts are the only cross-domain interrupts).
  3. gc interrupts, an even broader class that includes service interrupts but also the normal/synchronous event of running out of minor heap space. This is the notion of "interrupt" used in Caml_check_gc_interrupt and caml_handle_gc_interrupt.

GC interrupts are checked through a single test in the hot path of the mutators, which compares young_ptr against young_limit. They are therefore always notified by mutating young_limit.

Copy link
Copy Markdown
Member

@gasche gasche left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

A new round of comments and questions based on the terminology and classification I propose in #11095 (comment).

atomic_store_rel(s->interrupt_word, (uintnat)(-1));
}

int caml_incoming_interrupts_queued(void)
Copy link
Copy Markdown
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

If I understand correctly, this function checks for STW interrupts, but not other service interrupts.

CAML_EV_BEGIN(EV_INTERRUPT_GC);
if (atomic_load_acq(young_limit) == INTERRUPT_MAGIC) {
/* interrupt */
if (caml_incoming_interrupts_queued()) {
Copy link
Copy Markdown
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

If I understand correctly, there is a change of behavior here. Before, the test would check for any service interrupt, now we only check for STW interrupts. This makes sense because the call to caml_handle_incoming_interrupts within the conditional, if I understand correctly, only handle STW interrupts anyway.

Copy link
Copy Markdown
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Right, except there is no change of behaviour for the reason you explain.

}

caml_update_young_limit();
caml_poll_gc_work();
Copy link
Copy Markdown
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

If the conditional above only handles STW interrupts, then all other forms of interrupts should be handled by caml_poll_gc_work, or will not be handled by this function. (This is a bit weird, and ideally there would be some documentation of which interrupts are intended to be handled where.)

From reading the code of caml_poll_gc_work, my impression is that:

  • synchronous minor-gc requests are serviced, as the name suggests
  • explicit GC requests from users are serviced
  • "requested external interrupts" (tick thread stuff) are serviced
  • but signals are not serviced at this point

Copy link
Copy Markdown
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Please see if the added documentation on the other PR is good enough

achieves the proper synchronisation. */
atomic_exchange(&dom_st->young_limit, (uintnat)dom_st->young_start);
if (caml_incoming_interrupts_queued()
|| atomic_load_explicit(&dom_st->requested_external_interrupt,
Copy link
Copy Markdown
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

If I understand correctly, caml_update_young_limit as implemented here will keep young_limit in interrupt-triggering state if a STW request is still pending, or a "requested external interrupt". On the other hand, signals and explicit GC requests are not checked -- even if some are pending, young_limit will be reset to young_start. I'm not sure I understand the reasoning for behaving differently on these different types of interruptions -- a documentation comment would help.

runtime/domain.c Outdated
CAML_EV_END(EV_INTERRUPT_REMOTE);
}

caml_update_young_limit();
Copy link
Copy Markdown
Member

@gasche gasche Apr 9, 2022

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

I'm not sure about the logic here.

I think that what's going on is that we know we received an interrupt, we want to service it (except signals?), and "silence" the interrupt, but we also want to behave correctly if a new interrupt comes while we are servicing the first one. (At least we don't want to discard the notification for this second interrupt.)

I see broadly two approaches to do this:

  1. We could discard the interrupt notification right away (by setting young_limit to young_start), before doing anything, and then service all interrupts received at this point. If we decide to not service an interrupt, we probably need to restore the notification (by setting young_limit again to a triggering value). If new interrupts come in the meantime, they will notify themselves to be handled next time.
  2. Or we could service interrupts (without touching young_limit`), and then before leaving discard the interrupt notification unless there are new interrupts queued.

Approach (1) feels more robust to me, but I think both could work in theory.

Before this commit, this function was clearly following approach (1): the INTERRUPT_MAGIC test would catch all service interrupts and the conditional would start by resetting young_limit and then handling the STW interrupts (only); and then the caml_poll_gc_work call would deal with other service interrupts and normal GC work. (Signal interrupts would get dropped, which may be unintended?)

After this commit, the code seems to do a weird mix of (1) and (2):

  • first we check for STW interrupts and service them, without resetting young_limit, and only then do we call caml_update_young_limit; this sounds like approach (2) for STW interrupts.
  • but after caml_update_young_limit we handle all other gc interrupts, this sounds like approach (1) for non-STW interrupts.

This sounds more complex than necessary. Am I missing something?

Copy link
Copy Markdown
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

From @gadmm explanation of caml_update_young_limit's intended specification in #11095 (comment), I suspect that there is a third approach which he may be considering: if each sort of interrupt is backed by an update to some other runtime state than the young limit, and young_limit is just an efficiency mechanism (it is meant to be equivalent to checking all the other parts in turn), then young_limit is not itself "the notification" as I had assumed earlier, and its value does not matter as long as it is correctly resynchronized with the rest of the runtime state once we are done dealing with interrupts. So there is no need for a global choice between approaches (1) and (2) hinted above, we can let each handler for a category of interrupt deal with its own runtime state, and then refresh the young limit the handlers are done.

Copy link
Copy Markdown
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

In fact, reading the whole diff, I can see that another call to update young_limit was added in this PR at the end of caml_poll_gc_work, suggesting that this third way is indeed the general approach.

Copy link
Copy Markdown
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Yes, this is what I meant in the commit log of the 4th commit (young_limit does not carry information). I think this is how Jacques-Henri meant caml_update_young_limit.

Copy link
Copy Markdown
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Ah, I understood your commit message as saying that there is no need to check against a specific INTERRUPT_MAGIC value, instead of any value above the next young pointer.

@gadmm
Copy link
Copy Markdown
Contributor Author

gadmm commented Apr 11, 2022

As far as PR #11057 is concerned, the meaningful difference is between delayable actions that may run OCaml code, and non-delayable actions which may not run OCaml code (documented here: 9d83eda#diff-93ee9a0b40685be35da1909f57cba82cfe51d5a8de495947e5068a8b8d8172cbR247-R277). I agree that some remaining terminology in the code would benefit from being clarified. Let's do this afterwards?

@gadmm
Copy link
Copy Markdown
Contributor Author

gadmm commented Apr 11, 2022

Re < vs. <=. I was working from the pre-multicore code, so I had not noticed that. I guess it was fixed by Jacques-Henri when he made the corresponding change in 4.12.

@gasche
Copy link
Copy Markdown
Member

gasche commented Apr 11, 2022

The 4.14 code is architected very differently, but caml_alloc_dispatch_small has

ocaml/runtime/minor_gc.c

Lines 531 to 534 in bfb4b1e

/* Now, there might be enough room in the minor heap to do our
allocation. */
if (Caml_state->young_ptr - whsize >= Caml_state->young_trigger)
break;

which I think corresponds to a < check (negated) indeed.

On the other hand, continuing down this line of thinking: Multicore uses max_int (I mean (uintnat)(-1)) to set the young_limit trigger, and yet max_int is not in fact an allocatable part of the minor heap. So maybe it is intentional to move from < checks to <= checks? (But then was the move done everywhere?) In that case I would expect the default choice to be young_start - 1 rather than young_start. (Note: I guess Multicore is using max_int instead of young_end because young_end is a domain-local variable so it is less clear that it can be read safely from another domain for cross-domain interrupts?)

@gadmm
Copy link
Copy Markdown
Contributor Author

gadmm commented Apr 11, 2022

But then was the move done everywhere?

I do not think there is such a meaningful change intended. amd64.S still uses jb (<).

On the other hand your comment reminds me of the off-by-one bug I mentioned at #11095 (comment). If young_limit == young_end then you might succeed at allocating even though it signals an interrupt. As I mentioned, there was no zero-size allocation in older OCamls, so it was fine to use young_end but after adding polling locations as zero-size allocations, it is possible that OCaml 4 had a bug there like a previous version of my PR. This is fixed in multicore by taking an even greater value of young_limit (and it is indeed much simpler to set the same constant value in cross-domain interrupts). But the comparison remains < in any case.

@gasche
Copy link
Copy Markdown
Member

gasche commented Apr 11, 2022

So if I am following this right, we both agree that there is a small bug here, but it's not in your code, so of course it's up to you to touch it or not, in this PR or another.

@gadmm
Copy link
Copy Markdown
Contributor Author

gadmm commented Apr 11, 2022

I do not think it is a bug (< vs. <=), it is just being overly conservative.

@gasche
Copy link
Copy Markdown
Member

gasche commented Apr 11, 2022

I re-read the whole diff, instead of per-commit, and I think I now understand what's going on and agree that the proposed changes are correct. But I'm still nervous about this call to caml_reset_young_limit in the middle of allocate_minor_heap, it smells weird and I could not convince myself that it is correct in not-fully-initialized situations (such as when called from create_domain). I would like to better understand this last point before approving, and the ball is in @gadmm's court.

@gadmm
Copy link
Copy Markdown
Contributor Author

gadmm commented Apr 11, 2022

Re: strategy.

It is difficult to structure a PR for such a large change.

There is not much more to explain about what happens in each commit in the commit logs (they really do not do much each); what is missing from the commit logs is perhaps the overall picture (for this I have some documentation in code in later commits).

This is made more complicated by the fact that the reference for this change is the pre-multicore code. It is a bit as if you were reviewing my rebasing of the old code on top of multicore. I recommend getting sense of the other PR first, with this point in mind. This might be easier for people familiar with the pre-multicore code.

I do think, though, that the commits are individually sound-enough that a reviewer can make sure that no bug or regression is introduced. This is why I think that this prefix PR, though not necessary, can be reviewed independently on a correctness basis once you are convinced that it is good to get back closer to the pre-multicore mechanism.

In particular for design comments on individual commits, I recommend cross-checking with the final state of the PR to see if a later commit does not resolve the issue.

CAML_EV_END(EV_INTERRUPT_REMOTE);
}

caml_reset_young_limit(Caml_state);
Copy link
Copy Markdown
Member

@gasche gasche Apr 11, 2022

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Reading the whole diff, it's subtle that there is a call to caml_reset_young_limit here and also one at the end of caml_poll_gc_work called just below. Naively I wondered if we could keep just the reset at the end.

Here is my understanding of what is going on:

  • It would be wrong to just remove the reset here, because it would make caml_poll_gc_work empty the minor heap, even in the case where the limit was triggered only due to a STW interrupt and no GC work is in fact necessary.
  • But we could remove the reset if there was a call to caml_reset_young_limit at the end of caml_handle_incoming_interrupts. From a distance it sounds like that would be clearer than the current code. @gadmm, why did you add the reset here instead?

Copy link
Copy Markdown
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

It is there because the next line can still run OCaml code (finalisers) at this point. It is removed entirely in a later commit.

Copy link
Copy Markdown
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Okay. I'm still not sure why it's not in caml_handle_incoming_interrupts instead, but it doesn't really matter for transitory code, it's correct either way.

@gadmm
Copy link
Copy Markdown
Contributor Author

gadmm commented Apr 11, 2022

If I understand correctly, caml_update_young_limit as implemented here will keep young_limit in interrupt-triggering state if a STW request is still pending, or a "requested external interrupt". On the other hand, signals and explicit GC requests are not checked -- even if some are pending, young_limit will be reset to young_start. I'm not sure I understand the reasoning for behaving differently on these different types of interruptions -- a documentation comment would help.

For signals it comes in a later PR. For requested minor & major, this comes from the fact that this is indeed only requested synchronously, and caml_reset/update_young_limit is only called after they had an occasion of being serviced (AFAIR). However this is more restrictive than how I advertised caml_reset_young_limit—that you can call it anytime. Would you like me to add a check for requested minor & major there?

@gasche
Copy link
Copy Markdown
Member

gasche commented Apr 11, 2022

Would you like me to add a check for requested minor & major there?

I think that it would be a good idea if there is no performance impact. It makes the whole thing easier to reason about and specify (young_limit is the conjunction of young_start and all interruption sources). But if you deal with signals in a later PR, it is of course fine to deal with GC requests in another PR as well.

(Or: we could implement synchronous user GC requests with another mechanism that does not try to pass as an asynchronous interrupt, for example by calling an inner function directly. That alternative approach would also make the whole thing more consistent and easier to specify.)

@gadmm
Copy link
Copy Markdown
Contributor Author

gadmm commented Apr 11, 2022

Or: we could implement synchronous user GC requests with another mechanism that does not try to pass as an asynchronous interrupt, for example by calling an inner function directly. That alternative approach would also make the whole thing more consistent and easier to specify.

I am not sure what you mean. It is a request that is meant to be delayed until the subsequent safe point.

@gasche
Copy link
Copy Markdown
Member

gasche commented Apr 12, 2022

It is a request that is meant to be delayed until the subsequent safe point.

Sorry, that was wrong indeed. I thought (wrongly) that caml_request_minor_gc is only used as a primitive invoked by the user through Gc.minor(). In fact it is also used in the runtime in several places where GC pressure piles up. (Overflowing remembered set, custom allocation pressure.)

@gasche
Copy link
Copy Markdown
Member

gasche commented Apr 12, 2022

(At this point I'm again in the state where my question on initialization is the only remaining one, that I know of, before approval.)

@sadiqj
Copy link
Copy Markdown
Contributor

sadiqj commented Apr 12, 2022

If you've got a Changes @gadmm I'm happy to merge now @gasche has taken a look too.

gadmm added 4 commits April 13, 2022 19:22
The output has been disassembled (gcc -O2, arch=x86-64) to confirm
that `CAMLunlikely` works as intended and produces equivalent code as
before.
This function ensures that young_limit is always reset to its desired
value, without risk of race.

Also avoid relying on an information-carrying value for young_limit.
Aim to take all requests into account to simplify reasoning

Reset young_limit after reinitialisation of interrupt variables
at least to me

(the absence of race in the assertion depended on who calls it)
@gadmm gadmm force-pushed the multicore_async_actions_1 branch from 751a6ae to 80b8432 Compare April 13, 2022 17:23
@gadmm
Copy link
Copy Markdown
Contributor Author

gadmm commented Apr 13, 2022

Thanks, I have cleaned-up the history.

I'll go with a no-change-entry-needed tag for this one, since I cannot think of a discernible change in behaviour it introduces.

@sadiqj sadiqj merged commit ec01245 into ocaml:trunk Apr 13, 2022
@sadiqj
Copy link
Copy Markdown
Contributor

sadiqj commented Apr 13, 2022

Thanks for your perseverance with this @gadmm !

@gadmm
Copy link
Copy Markdown
Contributor Author

gadmm commented Apr 13, 2022

Thanks !

@gadmm
Copy link
Copy Markdown
Contributor Author

gadmm commented Apr 13, 2022

I think that you have inadvertently squashed the history.

@sadiqj
Copy link
Copy Markdown
Contributor

sadiqj commented Apr 13, 2022

I think that you have inadvertently squashed the history.

Oh dear, I did. Sorry. I also noticed the tests had failed on trunk. I don't think that's related though (it seems to be the intermittent weak lifetime parallel failure. Odd that it happened on two of the testsuites though..)

Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment

Projects

None yet

Development

Successfully merging this pull request may close these issues.

3 participants