Refactor how IoContext aborts are implemented#4165
Conversation
f8f8b6a to
482bf4d
Compare
|
The internal build will fail until internal PR 10559 is merged to improve the output gate test. That PR can be merged first; it doesn't depend on this PR, it just enables this PR. |
482bf4d to
6f4aa13
Compare
6f4aa13 to
c7df745
Compare
|
Just rebased. Unfortunately there's one test in the internal codebase that does depend on that error message so I had to go open a PR for that... |
irvinebroque
left a comment
There was a problem hiding this comment.
There were various race where an aborted context might throw some exception at the client that was not the abort exception, such as throwing "This promise will never resolve" or "The script will never generate a response". Many of these should be fixed now to actually propagate the abort exception.
This is awesome — is there a basic example we could use in a changelog entry?
Want to highlight to customers that we are fixing things that result in opaque errors for them.
5b82ef4 to
b7b0431
Compare
Will make subsequent changes easier.
Prior to this, `abort()` was just one one way of fulfilling the abort promise, but other ways were joined into the promise in the constructor. Confusingly, this meant that `abortException` was not set correctly when aborting for any of those other reasons. Now the constructor calls `abortWhen()` to wait on each of those promises that were previously joined in. When any promise passed to `abortWhen()` eventually rejects, `abort()` is called.
I can't think of any reason why this was done in a different place, so let's consolidate.
I couldn't find any existing test.
I've always wanted to do this. Node's process.exit() already figured it out, so this copies its technique. Relatedly, we can have `runImpl()` rethrow the correct abort exception, yay.
This means they don't need the `finalize()` mechanism anymore. Also this means we don't need to separately join with `onAbort()` in `WorkerEntrypoint::request()`, becaues the thing `ServiceWorkerGlobalScope::request()` returns is already an `awaitJs()` promise.
Note: This has nothing to do with GC finalization. This was sort of an alternative abort mechanism where we could cause all promises exported from the IoContext to reject when we detected a hang via PendingEvents. We can just use `abort()` instead.
It doesn't really make sense that we'd null it out specifically after a hang but not in other cases. Also remove the seemingly-irrelevant nulling out of `promiseContextTag` after a hang.
Fixes the errors in our test.
This should prevent lingering tasks from failing later due to there being no IncomingRequest. I'm pretty sure the only way that happens is when the actor was aborted.
There are now two methods: * js.terminateNextExecution(), which has the previous behavior. * js.terminateExecutionNow(), which forces the termination to happen synchronously and always throws. This lets us take the hack that had proliferated to three different places and keep it in one place.
b7b0431 to
44d3004
Compare
|
|
||
| // HACK: This has been observed to reliably make V8 check the termination flag and raise the | ||
| // uncatchable termination exception. | ||
| jsg::check(v8::JSON::Stringify(v8Context(), str())); |
There was a problem hiding this comment.
That's an interesting (and weird, as always in V8) finding. Is there any V8 doc or discussion about it?
There was a problem hiding this comment.
Nope. It was just the first thing that worked and we didn't spend too much time thinking about it. Probably there are many V8 APIs we could call which should cause it to check the termination flag.
Of course, it would be better to add an explicit API to V8 to ask it to check the flag but that takes time and there's so much to do. :) If this stops working the tests will tell us.
There was a problem hiding this comment.
Yeah, having an explicit API would definitely be ideal, though I imagine it wouldn’t be easy to get into V8. Still, I’m impressed you guys found this workaround — thanks for sharing it!
IoContext abort logic evolved gradually and, much like evolution, the result was a mess.
This PR attempts to improve this.
Note that this is based on top of #4161; please review that first, and I'll keep this one as a draft in the meantime.
Aside from cleaning things up, this fixes two bugs:
IoContext::abortExceptionwas only set for aborts caused by someone explicitly callingIoContext::abort(), but many kinds of aborts (such as input gate and output gate failures) did not come in through this path. So, the previous attempt to makerun()rethrow the abort exception if called post-abort did not always work. Now, all kinds of aborts callabort(), so the exception propagates properly.expected !incomingRequests.empty()errors later when they tried to re-enter the isolate.