common/async: spawn_group and parallel_for_each() by cbodley · Pull Request #50005 · ceph/ceph

cbodley · 2023-02-06T16:27:13Z

adds a spawn_group template class for fork-join parallelism with boost::asio::awaitable<void> coroutines, and builds a parallel_for_each() algorithm (modeled after seastar::parallel_for_each) on top

the main use case in rgw multisite is spawning a coroutine for each log shard, then waiting for all of them to complete

the existing ceph::async::co_throttle from #49720 can be used for this, but its interface was designed for bounded concurrency, so it is less convenient in the unbounded fork-join case: the caller has to co_await each spawn separately and handle potential errors from other coroutines at every one of those points. with spawn_group, errors are reported through a single co_await on the wait() member function

asio provides a generic fork-join algorithm in asio::experimental::make_parallel_group(), but it requires the group size to be known at compile time. for multisite, the shard counts are variable and come from ceph.conf

spawn_group example

awaitable<void> child(task& t);

awaitable<void> parent(std::span<task> tasks)
{
  // process all tasks in parallel
  auto ex = co_await boost::asio::this_coro::executor;
  auto group = spawn_group{ex, tasks.size()};
  for (auto& t : tasks) {
    boost::asio::co_spawn(ex, child(t), group);
  }
  co_await group.wait();
}

parallel_for_each() example

awaitable<void> child(task& t);

awaitable<void> parent(std::span<task> tasks)
{
  co_await parallel_for_each(tasks.begin(), tasks.end(), child);
}
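
for readers unfamiliar with the seastar-style algorithm, the sketch below shows the same call shape with std::async in place of coroutines. parallel_for_each_async is a hypothetical name used only here; it is not the function added by this PR, and it assumes the callable is safe to invoke concurrently from several threads. every element's task is launched before any of them is waited on, and the first failure (in range order) is rethrown after all tasks have finished:

```cpp
#include <exception>
#include <future>
#include <vector>

// Hypothetical thread-based analogue of parallel_for_each(); the name and
// signature are illustrative only.
template <typename Iterator, typename Func>
void parallel_for_each_async(Iterator begin, Iterator end, Func func)
{
  std::vector<std::future<void>> pending;

  // fork: launch func(*i) for every element before waiting on any of them
  for (auto i = begin; i != end; ++i) {
    pending.push_back(
        std::async(std::launch::async, [&func, i] { func(*i); }));
  }

  // join: wait for every child, remembering the first failure in range order
  std::exception_ptr first_error;
  for (auto& f : pending) {
    try {
      f.get();
    } catch (...) {
      if (!first_error) {
        first_error = std::current_exception();
      }
    }
  }
  if (first_error) {
    std::rethrow_exception(first_error);
  }
}
```

a call such as parallel_for_each_async(tasks.begin(), tasks.end(), child) mirrors the one-line parent() above. the difference is that this version blocks the calling thread, while the coroutine version suspends on co_await.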

github-actions · 2023-02-14T19:07:12Z

This pull request can no longer be automatically merged: a rebase is needed and changes have to be manually resolved

github-actions · 2023-02-28T22:00:20Z

This pull request can no longer be automatically merged: a rebase is needed and changes have to be manually resolved

cbodley · 2023-03-01T00:38:32Z

merged into #49737

cbodley requested review from adamemerson and yuvalif February 6, 2023 16:27

github-actions bot added build/ops common tests labels Feb 6, 2023

cbodley mentioned this pull request Feb 10, 2023

asio: drop BOOST_ASIO_USE_TS_EXECUTOR_AS_DEFAULT #50064

Closed

adamemerson approved these changes Feb 13, 2023

adamemerson and others added 23 commits February 14, 2023 12:52

build: Bump boost to 1.81

39abb3e

Needed to fix coroutine detection under Clang. TODO: update boost package sha1 in install-deps.sh when we have packages uploaded. Signed-off-by: Adam C. Emerson <aemerson@redhat.com>

neorados: Switch to async_initiate

2e9b69c

This form of completion handling is compatible with C++20 Completions and generally more flexible stuff. Signed-off-by: Adam C. Emerson <aemerson@redhat.com>

neorados: Don't call dispatch inside with_osdmap

9d7690a

The lock will continue to be held over the 'dispatch' with C++20 coroutines. Signed-off-by: Adam C. Emerson <aemerson@redhat.com>

neorados: Ensure we don't have dangling references

07ecb4f

Apparently in some cases we were using references to objects on the caller stack that happen to no longer exist if the caller's stack goes away. C++20 Coroutines tickled this bug. Signed-off-by: Adam C. Emerson <aemerson@redhat.com>

neorados: Fix logic error in exerciser

2edce78

Use `min` rather than `max` when deciding how much we need to read in `read()`. Signed-off-by: Adam C. Emerson <aemerson@redhat.com>

neorados: Use C++20 Coroutines in exerciser/demo

2a27649

Just as a demonstration to see how well they work and how to put things together with them. Signed-off-by: Adam C. Emerson <aemerson@redhat.com>

cls/version: Move obj_version printer to cls_version_types.h

e535e17

Signed-off-by: Adam C. Emerson <aemerson@redhat.com>

cls/version: Add non-default constructor

53829bd

Signed-off-by: Adam C. Emerson <aemerson@redhat.com>

neorados/cls: Helper function for reading CLS values

08e7810

Since reading CLS values in a friendly way requires calling out to RADOS then decoding the returned structure, make a function to help with that. Signed-off-by: Adam C. Emerson <aemerson@redhat.com>

test/neorados: Harness and convenience for Neorados tests

6c18d41

Google Test does not support C++ coroutines, so kludge together a test harness that supports coroutines reasonably well. Also add a couple utility functions. Signed-off-by: Adam C. Emerson <aemerson@redhat.com>

neorados/cls: Client for version objclass

bf69249

Signed-off-by: Adam C. Emerson <aemerson@redhat.com>

common/async: handler wrappers forward associated cancellation_slot

e63e71e

Signed-off-by: Casey Bodley <cbodley@redhat.com>

common/async: add service template for execution_context shutdown

9ce4df1

Signed-off-by: Casey Bodley <cbodley@redhat.com>

common/async: add co_throttle for bounded concurrency with c++20 coroutines

5b86ebd

Signed-off-by: Casey Bodley <cbodley@redhat.com>

cls/log: Switch from std::list to std::vector

1e72da3

We should not be using std::list everywhere, and this is an excellent time to switch. Signed-off-by: Adam C. Emerson <aemerson@redhat.com>

cls/log: Add non-default constructors to cls_log_entry

15ad1d7

Signed-off-by: Adam C. Emerson <aemerson@redhat.com>

cls/log: Switch from utime_t to ceph::real_time

109d073

Signed-off-by: Adam C. Emerson <aemerson@redhat.com>

cls/log: C++ namespaces exist

084c5ce

Signed-off-by: Adam C. Emerson <aemerson@redhat.com>

neorados/cls: cls::exec handles tuples properly

6684ebb

If our decoder function returns a tuple of multiple values, flatten it so our signature is `void(error_code, T, U, V)` not `void(error_code, std::tuple<T, U, V>)`. Signed-off-by: Adam C. Emerson <aemerson@redhat.com>

osdc: Catch exceptions thrown in CLS client decoders

7e8a980

And return them to the client by setting the error code and result in the vector and returning an error from the operation as a whole. Signed-off-by: Adam C. Emerson <aemerson@redhat.com>

neorados/cls/version: Don't swallow exceptions

dd71051

Since they can be reported now, report them Signed-off-by: Adam C. Emerson <aemerson@redhat.com>

adamemerson and others added 6 commits February 14, 2023 12:52

neorados/cls: Client for log objclass

0eda6bc

Signed-off-by: Adam C. Emerson <aemerson@redhat.com>

common/async: move service out of detail

f53946d

Signed-off-by: Casey Bodley <cbodley@redhat.com>

common/async: add co_waiter class template

6aa74a0

Signed-off-by: Casey Bodley <cbodley@redhat.com>

common/async: co_throttle_impl uses co_waiter

8b037e5

Signed-off-by: Casey Bodley <cbodley@redhat.com>

rgw/sync: add with_lease() with polymorphic LockClient

fe1b1f5

Signed-off-by: Casey Bodley <cbodley@redhat.com>

rgw/sync: wrap lease renewal exceptions in lease_aborted

9ceab5f

Signed-off-by: Casey Bodley <cbodley@redhat.com>

adamemerson force-pushed the wip-coro-after-reef branch from 27c3c39 to 9ceab5f Compare February 14, 2023 19:06

adamemerson requested review from a team as code owners February 14, 2023 19:06

github-actions bot added the needs-rebase label Feb 14, 2023

cbodley added 3 commits February 15, 2023 15:07

common/async: add co_waiter::op_cancellation ctor for clang

1ab0b68

Signed-off-by: Casey Bodley <cbodley@redhat.com>

common/async: add spawn_group template for fork-join parallelism

3d63481

Signed-off-by: Casey Bodley <cbodley@redhat.com>

common/async: add parallel_for_each() algorithm

90a6bc7

Signed-off-by: Casey Bodley <cbodley@redhat.com>

cbodley force-pushed the wip-async-parallel-for-each branch from c74ea5a to 90a6bc7 Compare February 15, 2023 20:08

github-actions bot removed the needs-rebase label Feb 15, 2023

adamemerson force-pushed the wip-coro-after-reef branch from 9ceab5f to 88c6d6d Compare February 28, 2023 20:02

github-actions bot added the needs-rebase label Feb 28, 2023

cbodley closed this Mar 1, 2023

This was referenced Apr 26, 2024

rgw: make incomplete multipart upload part of bucket check efficient #57083

Merged

common/async: add primitives for structured concurrency with optional_yield #57188

Merged

cbodley mentioned this pull request Jun 25, 2024

doc/dev/rgw: design doc for metadata sync with c++20 coroutines #58268

Closed

cbodley wants to merge 32 commits into ceph:wip-coro-after-reef from cbodley:wip-async-parallel-for-each