-
Notifications
You must be signed in to change notification settings - Fork 1.6k
add a join_background_tasks method to Host{In,Out}putStream and WasiCtx. #6823
New issue
Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.
By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.
Already on GitHub? Sign in to your account
Conversation
This method flushes output and terminates background tasks. Background
tasks still terminate as part of Drop!
The problem with the current implementation is that there is no way to
wait for output buffered in host background tasks to flush before
aborting those tasks as part of dropping the Store/Table. This means that
e.g. for a trivial component that prints "hello world\n" to stdout and
returns, if the Store&Table drop immediately after execution of the
component completes, there is a race and the output may not happen at
all.
I don't really love this design, but we got backed into a corner because
all of the alternatives we could think up are worse:
* We can't just get rid of the abort-on-drop completely ("daemonize" the
tasks) because that means that streams that are connected to e.g. a
stalled client connection will consume resources forever, which is not
acceptable in some embeddings.
* We can't ensure flushing on drop of a table/store because it requires
an await, and rust does not have an async drop
* We can't add an explicit destructor to a table/store which will
terminate tasks, and if this destructor is not called tasks will
"daemonize", because that means cancellation of the future executing
a component before the explicit destructor is called will end up
daemonizing the task.
* We could configure all AsyncWriteStreams (and any other stream impls
that spawn a task) at creation, or at insertion to the table, with
whether they should daemonize on drop or not. This would mean plumbing a
bunch of config into places it currently is not.
Therefore, the only behavior we could come up with was to keep
the abort-on-drop behavior for background tasks, and add methods
to ensure that background tasks are joined (finished) gracefully.
This means that both sync and async users of WasiCtx will need to
call the appropriate method to wait on background tasks. This is
easy enough for users to miss, but we decided that the alternatives are
worse.
Closes #6811
|
Waiting to merge until
|
|
I don't see a discussion of the alternative approach where background tasks are daemonized by default but there is a method that returns a future which completes when all background tasks are done or cancels the background tasks on Such an approach seems like the ideal approach since it doesn't risk I/O operations not flushing and easily allows users to opt into ensuring that resources are not leaked. If this is really just a question of which default is better (best effort of I/O flushing vs protecting against potential resource leaks), I would personally pick I/O flushing. The approach in this PR seems like it will be a footgun that a lot of users will run into even if the docs are all updated to mention it. |
|
Oh that's discussed a bit in @pchickey's points above, but I can to help clarify some more. I do realize though that yesterday in our sync we ended up concluding daemonize-by-default was the right option, and both @pchickey and I feel the same way but the code ended up not working out too well. One of the main problems with that is that for embeddings where cancel-by-default is desired then that option needs to be active for the entire lifetime of One idea would be to make this an option on Given all that the conclusion we reached was to go the opposite way and cancel-on-drop rather than daemonize and require these methods to be invoked to ensure output reaches its destination. I don't disagree it's a footgun though but I think that fixing that may need to happen at a different level rather than daemonize-or-cancel perhaps. One example there was what I mentioned yesterday where |
|
That's fine. I do think it would go a long way to make blocking writes only return once wasmtime has handed them off to the underlying host kernel. I believe we discussed the possibility of adding a call to |
|
@pchickey I think there's one more block to handle as well (same as input streams) |
That would make blocking write into blocking write + flush, but only work for the one implementation today where ready also means flush. I don't think we want to make every blocking write include a flush, since the flush could hang if the other end of the pipe stops listening, and I also don't want to make ready mean flush on all writers, which limits the flexibility for future implementations. |
This commit adds a new operation to `AsyncWriteStream` which flushes the output of the underlying writer. This will invoke the underlying `.flush()` to ensure that the flush request is propagated all the way through the host implementation and Tokio bits. The `join_background_tasks` method is then updated to call `flush` and discard its result. I've opted to do a bit of refactoring to avoid a lot of duplication in the new `flush` method with the preexisting methods, so there's a bit of movement here too.
This builds on the prior commit to redefine the `join_background_tasks` as purely a "flush" operation that flushes all output streams. This no longer attempts to gracefully join output tasks one-by-one since that should no longer be necessary and everything will get cleaned up during drop when `abort` calls are invoked.
|
My previous comment was incorrect as that's related to input streams, not output streams, which is what this cares about. I've additionally pushed up some more commits after some more discussion with @pchickey. The thinking is that it's pretty likely at this point for WASI to gain a "flush" on So I've updated The thinking afterwards, to solve your concern @rylev, is that WASI will gain |
| /// This will block the current thread up to the `timeout` specified, or | ||
| /// forever if `None` is specified, until all wasm output has been flushed | ||
| /// out. | ||
| pub fn sync_join_background_tasks<'a>( |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Should this be sync_flush_output?
| } | ||
|
|
||
| // Make sure the background task does not outlive the AsyncWriteStream handle. | ||
| // In order to make sure output is flushed before dropping, use `join_background_tasks`. |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
| // In order to make sure output is flushed before dropping, use `join_background_tasks`. | |
| // In order to make sure output is flushed before dropping, use `flush_output`. |
| /// explicitly wait for this. | ||
| /// | ||
| /// In some embeddings, a misbehaving client might cause this graceful exit to await for an | ||
| /// unbounded amount of time, so we recommend bounding this with a timeout or other mechanism. |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
I do wonder if this should also take a timeout just like the sync version. This might encourage users to do the right thing and it has the added benefit of being symetrical with the sync version.
| fn send(&mut self, bytes: Bytes) -> anyhow::Result<(usize, StreamState)> { | ||
| use tokio::sync::mpsc::error::TrySendError; | ||
| fn send(&mut self, msg: WriteMesage) -> anyhow::Result<bool> { | ||
| assert!(!self.has_pending_op()); |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Nit: is this necessary if you have to check for the pending op anyway down below?
|
I know that various bca people are on the same page but for onlookers: this is just too sticky of an issue to hack on a fix in the host like this, and we identified other issues with output-stream, so we are going to redesign the way output-stream does backpressure to make this no longer be a problem. |
This method flushes output and terminates background tasks. Background tasks still terminate as part of Drop!
The problem with the current implementation is that there is no way to wait for output buffered in host background tasks to flush before aborting those tasks as part of dropping the Store/Table. This means that e.g. for a trivial component that prints "hello world\n" to stdout and returns, if the Store&Table drop immediately after execution of the component completes, there is a race and the output may not happen at all.
I don't really love this design, but we got backed into a corner because all of the alternatives we could think up are worse:
Therefore, the only behavior we could come up with was to keep the abort-on-drop behavior for background tasks, and add methods to ensure that background tasks are joined (finished) gracefully. This means that both sync and async users of WasiCtx will need to call the appropriate method to wait on background tasks. This is easy enough for users to miss, but we decided that the alternatives are worse.
Closes #6811