crimson/osd: optimize crimson-osd's client requests process parallelism#39772
Conversation
|
jenkins test make check |
|
jenkins test crimson perf |
|
jenkins test crimson perf |
|
This pull request can no longer be automatically merged: a rebase is needed and changes have to be manually resolved |
895356c to
68d690d
Compare
|
jenkins test crimson perf |
|
There seems to be some bugs in this PR which I'm trying to fix right now. |
68d690d to
ac65eba
Compare
|
This pull request can no longer be automatically merged: a rebase is needed and changes have to be manually resolved |
ac65eba to
47b25f5
Compare
47b25f5 to
469d390
Compare
|
This pull request can no longer be automatically merged: a rebase is needed and changes have to be manually resolved |
8641700 to
932ba74
Compare
932ba74 to
c309aaf
Compare
c309aaf to
6e0a594
Compare
|
@rzarzynski could you take a final look? |
| rep_op_fut_t maybe_mutated = | ||
| interruptor::make_ready_future<rep_op_fut_tuple>( | ||
| seastar::now(), | ||
| interruptor::make_interruptible(osd_op_errorator::now())); |
There was a problem hiding this comment.
does it need to initialize maybe_mutated here? it is set value in the following.
There was a problem hiding this comment.
we can improve this part in a follow-up change. i think @xxhdx1985126 is just reusing the existing code structure.
please note, maybe_mutated initialized here is used if op_effects.empty().
| return interruptor::do_for_each(op_effects, [] (auto& op_effect) { | ||
| return op_effect->execute(); | ||
| }); | ||
| return maybe_mutated.then_unpack_interruptible( |
There was a problem hiding this comment.
why not safe_then_unpack_interruptible here?
There was a problem hiding this comment.
the return type of PG::submit_transation is an interruptible_future_detail<InterruptCond, seastar::future<...>> now, it is a bit wierd here.
src/crimson/osd/pg.cc
Outdated
| const auto oid = m->get_snapid() == CEPH_SNAPDIR ? m->get_hobj().get_head() | ||
| : m->get_hobj(); | ||
| auto ox = std::make_unique<OpsExecuter>( | ||
| auto ox = seastar::make_lw_shared<OpsExecuter>( |
There was a problem hiding this comment.
why use lw_shared_ptr here?
There was a problem hiding this comment.
There are two places holding references to OpsExecuter, one being continuations following the all_completed future and the other the old one, now. The completion order of these two can not be determined at compile time. So I made this pointer a shared one.
There was a problem hiding this comment.
I noticed all places hold ox by ox.get(), so it not transfer the smart ptr, it transfer the raw ptr in ox, does it have problem by using unique_ptr?
There was a problem hiding this comment.
I think so. There may be chances that the unique_ptr gets destroyed before other raw ptrs, which would lead to memory access violations
src/crimson/osd/pg.cc
Outdated
| return PG::do_osd_ops_ertr::future<Ref<MOSDOpReply>>( | ||
| crimson::ct_error::eagain::make()); | ||
| return PG::do_osd_ops_ertr::future<rep_op_fut_t>( | ||
| crimson::ct_error::eagain::make()); |
There was a problem hiding this comment.
does it miss a seastar::now() future here? since return rep_op_fut_t it is a tuple。
There was a problem hiding this comment.
please note, we are not creating a future with its value here. we are returning a future carrying error. in this case, we don't pass a tuple or the argument for constructing a tuple to the constructor of errorator::future<>. otherwise, we'd use make_ready_future<>() instead.
| ::crimson::osd::IOInterruptCondition, T>; | ||
| using rep_op_fut_t = | ||
| std::tuple<interruptible_future<>, | ||
| interruptible_future<crimson::osd::acked_peers_t>>; |
There was a problem hiding this comment.
there are two rep_op_fut_t definition in pg.h and pg_backend.h. can we use different name to identify them?
There was a problem hiding this comment.
ok, i'll deal with it
|
This pull request can no longer be automatically merged: a rebase is needed and changes have to be manually resolved |
6e0a594 to
e72600a
Compare
|
jenkins test make check |
These two stages are used to provide more parallelism in the pipeline, while preserving client requests order. Signed-off-by: Xuehan Xu <xxhdx1985126@gmail.com>
Signed-off-by: Xuehan Xu <xxhdx1985126@gmail.com>
Make client requests go to the concurrent pipeline stage "wait_repop" once they are "submitted" to the underlying objectstore, which means their on-disk order is guaranteed, so that successive client requests can go into the "process" pipeline stage. Signed-off-by: Xuehan Xu <xxhdx1985126@gmail.com>
e72600a to
fd1ee08
Compare
dismiss my approval as the last commit does not look right.
otherwise any async execution of lambdas in PG::do_osd_ops_execute() may reference a outdated osd_op Signed-off-by: Xuehan Xu <xxhdx1985126@gmail.com>
fd1ee08 to
0a0848d
Compare
allow successive client requests to go into the "process" pipeline stage
once the current one is submitted to the pg backend.
Signed-off-by: Xuehan Xu xxhdx1985126@gmail.com
Checklist
Show available Jenkins commands
jenkins retest this pleasejenkins test classic perfjenkins test crimson perfjenkins test signedjenkins test make checkjenkins test make check arm64jenkins test submodulesjenkins test dashboardjenkins test apijenkins test docsjenkins render docsjenkins test ceph-volume alljenkins test ceph-volume tox