rgw: switch back to boost::asio for spawn() and yield_context#55592
This pull request can no longer be automatically merged: a rebase is needed and changes have to be manually resolved
@@ -122,7 +122,6 @@ namespace {
class Waiter {
  using Signature = void(boost::system::error_code);
I should probably make an attempt to get rid of this class entirely, but that's not urgent.
currently, it is only used to yield a coroutine.
so, for starters, we can remove the mutex + condition variable code.
i tried rebasing but saw lots of errors building the new d4n stuff. i'll have to work through that too
Signed-off-by: Casey Bodley <cbodley@redhat.com>
the qat async initiator functions were based on async_completion<> and its completion_handler member, but the updated boost::asio::yield_context doesn't provide a completion_handler. switch to the updated async_initiate() method, which does work with boost::asio::yield_context

Signed-off-by: Casey Bodley <cbodley@redhat.com>
a fork of boost::asio::spawn() was introduced in 2020 with spawn::spawn() from ceph#31580. this fork enabled rgw to customize how the coroutine stacks are allocated in order to avoid stack overflows in frontend request coroutines. this customization was based on a StackAllocator concept from the boost::context library

in boost 1.80, that same StackAllocator overload was added to boost::asio::spawn(), along with other improvements like per-op cancellation. now that boost has everything we need, switch back and drop the spawn submodule

this required switching a lot of async functions from async_completion<> to async_initiate<>. similar changes were necessary to enable the c++20 coroutine token boost::asio::use_awaitable

Signed-off-by: Casey Bodley <cbodley@redhat.com>
jenkins test api
jenkins test make check
@cbodley the kafka test failure above is a regression from this PR (although it is hard to tell, as a similar failure existed before). the last call will hang.
}, token, yield.get_executor());
l.unlock();
init.result.get();
@yuvalif i think the regression is probably due to this change. the unlock() should happen before suspend, so probably needs to be inside the lambda
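A rough model of the ordering problem described in the comment above, using only the standard library. Threads stand in for the coroutine here, and every name (`FakeSuspend`, `State`, `finish`, `wait_for_done`) is hypothetical rather than taken from the rgw code. The assumption is that the initiation runs before the caller suspends, so a lock released inside it is free by the time the completing thread needs it.

```cpp
#include <condition_variable>
#include <mutex>
#include <thread>

// Hypothetical stand-in for a suspending initiation: runs `start`, then blocks
// the caller until complete() is called, roughly the way a stackful coroutine
// suspends inside the initiating call.
class FakeSuspend {
  std::mutex m;
  std::condition_variable cv;
  bool completed = false;
 public:
  void complete() {
    { std::lock_guard<std::mutex> g(m); completed = true; }
    cv.notify_one();
  }
  template <typename Start>
  void initiate(Start&& start) {
    start();  // the initiation runs before the caller "suspends"
    std::unique_lock<std::mutex> g(m);
    cv.wait(g, [this] { return completed; });
  }
};

// Hypothetical shared state guarded by `lock`.
struct State {
  std::mutex lock;
  bool done = false;
  FakeSuspend suspend;
};

// Completion side: must take `lock` before it can wake the waiter.
void finish(State& s) {
  std::lock_guard<std::mutex> g(s.lock);
  s.done = true;
  s.suspend.complete();
}

// Waiting side: releases `lock` inside the initiation, before suspending.
bool wait_for_done(State& s) {
  std::unique_lock<std::mutex> l(s.lock);
  if (s.done) return true;
  s.suspend.initiate([&l] { l.unlock(); });
  // Unlocking here instead would keep `lock` held across the suspension,
  // so finish() could never acquire it and this call would hang.
  std::lock_guard<std::mutex> g(s.lock);
  return s.done;
}

// Runs one waiter against one finisher; returns 0 on success.
int run_demo() {
  State s;
  std::thread t([&s] { finish(s); });
  bool ok = wait_for_done(s);
  t.join();
  return ok ? 0 : 1;
}
```

Calling `run_demo()` from `main` exercises both orderings over repeated runs: either `finish()` wins the lock first and the waiter returns early, or the waiter suspends and is woken afterwards.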
this change was made in #54697 and was backported to reef and quincy, so we probably need to make this change there as well.
somewhat related, we may have a race condition in the http client:
https://github.com/ceph/ceph/blob/main/src/rgw/rgw_http_client.cc#L73
we don't use the lock to protect the "done" flag. maybe this is the reason for: https://tracker.ceph.com/issues/66033 ?
interesting. done is atomic, but it does look like wait() could still race with the finish() on another thread. i wonder why we haven't seen this anywhere else
i pushed a commit cbodley@4eab016 to lock before the done check but test_notification_push_http still fails the same way
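To make the suspected wait()/finish() race concrete, here is a minimal standard-library sketch of the "lock before the done check" idea. The `Request` class and its members are hypothetical and are not the rgw http client types. The done check and the callback registration share one critical section, so `finish()` cannot run between them and miss the waiter. This only illustrates the race; it says nothing about the test failure mentioned above.

```cpp
#include <functional>
#include <mutex>
#include <thread>
#include <utility>

// Hypothetical request state (illustration only).
class Request {
  std::mutex lock;
  bool done = false;
  int ret = 0;
  std::function<void(int)> completion;  // set while a waiter is registered
 public:
  // Check `done` and register the callback under the same lock.
  void async_wait(std::function<void(int)> cb) {
    std::unique_lock<std::mutex> l(lock);
    if (done) {
      int r = ret;
      l.unlock();
      cb(r);  // already finished: complete immediately
      return;
    }
    completion = std::move(cb);
  }

  // Record the result and hand it to the registered waiter, if any.
  void finish(int r) {
    std::function<void(int)> cb;
    {
      std::lock_guard<std::mutex> l(lock);
      ret = r;
      done = true;
      cb = std::move(completion);
    }
    if (cb) cb(r);  // invoke outside the lock
  }
};

// Races async_wait() against finish() many times; returns 0 if the callback
// ran exactly once with the expected value on every iteration.
int run_race_demo() {
  for (int i = 0; i < 200; ++i) {
    Request req;
    int calls = 0;
    int result = 0;
    std::thread t([&req] { req.finish(7); });
    req.async_wait([&](int r) { ++calls; result = r; });
    t.join();
    if (calls != 1 || result != 7) return 1;
  }
  return 0;
}
```

If the `done` check were made before taking the lock, `finish()` could complete in the gap, find no callback registered, and the waiter would then register a callback that nobody ever invokes.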
it could be that there is another issue. however, isn't it better to fix that regardless?