Skip to content

crimson/osd: optimize crimson-osd's client requests process parallelism#39772

Merged
tchaikov merged 4 commits intoceph:masterfrom
xxhdx1985126:wip-crimson-client-req-pipeline-parallelism
May 18, 2021
Merged

crimson/osd: optimize crimson-osd's client requests process parallelism#39772
tchaikov merged 4 commits intoceph:masterfrom
xxhdx1985126:wip-crimson-client-req-pipeline-parallelism

Conversation

@xxhdx1985126
Copy link
Contributor

@xxhdx1985126 xxhdx1985126 commented Mar 2, 2021

allow successive client requests to go into the "process" pipeline stage
once the current one is submitted to the pg backend.

Signed-off-by: Xuehan Xu xxhdx1985126@gmail.com

Checklist

  • References tracker ticket
  • Updates documentation if necessary
  • Includes tests for new functionality or reproducer for bug

Show available Jenkins commands
  • jenkins retest this please
  • jenkins test classic perf
  • jenkins test crimson perf
  • jenkins test signed
  • jenkins test make check
  • jenkins test make check arm64
  • jenkins test submodules
  • jenkins test dashboard
  • jenkins test api
  • jenkins test docs
  • jenkins render docs
  • jenkins test ceph-volume all
  • jenkins test ceph-volume tox

@xxhdx1985126 xxhdx1985126 requested a review from a team as a code owner March 2, 2021 04:35
@xxhdx1985126
Copy link
Contributor Author

jenkins test make check

@tchaikov
Copy link
Contributor

tchaikov commented Mar 2, 2021

jenkins test crimson perf

@tchaikov
Copy link
Contributor

tchaikov commented Mar 2, 2021

jenkins test crimson perf

@ceph ceph deleted a comment from xxhdx1985126 Mar 2, 2021
@ceph ceph deleted a comment from xxhdx1985126 Mar 2, 2021
@github-actions
Copy link

github-actions bot commented Mar 2, 2021

This pull request can no longer be automatically merged: a rebase is needed and changes have to be manually resolved

@xxhdx1985126 xxhdx1985126 force-pushed the wip-crimson-client-req-pipeline-parallelism branch from 895356c to 68d690d Compare March 3, 2021 02:50
@xxhdx1985126
Copy link
Contributor Author

jenkins test crimson perf

@xxhdx1985126 xxhdx1985126 changed the title crimson/osd: optimize crimson-osd's client requests process parallelism [DNM]crimson/osd: optimize crimson-osd's client requests process parallelism Mar 8, 2021
@xxhdx1985126
Copy link
Contributor Author

There seems to be some bugs in this PR which I'm trying to fix right now.

@xxhdx1985126 xxhdx1985126 force-pushed the wip-crimson-client-req-pipeline-parallelism branch from 68d690d to ac65eba Compare March 8, 2021 08:06
@github-actions
Copy link

github-actions bot commented Mar 8, 2021

This pull request can no longer be automatically merged: a rebase is needed and changes have to be manually resolved

@xxhdx1985126 xxhdx1985126 force-pushed the wip-crimson-client-req-pipeline-parallelism branch from ac65eba to 47b25f5 Compare March 11, 2021 05:23
@xxhdx1985126 xxhdx1985126 force-pushed the wip-crimson-client-req-pipeline-parallelism branch from 47b25f5 to 469d390 Compare March 11, 2021 08:30
@xxhdx1985126 xxhdx1985126 changed the title [DNM]crimson/osd: optimize crimson-osd's client requests process parallelism crimson/osd: optimize crimson-osd's client requests process parallelism Mar 11, 2021
@github-actions
Copy link

This pull request can no longer be automatically merged: a rebase is needed and changes have to be manually resolved

@xxhdx1985126 xxhdx1985126 force-pushed the wip-crimson-client-req-pipeline-parallelism branch from 8641700 to 932ba74 Compare May 11, 2021 01:48
@xxhdx1985126 xxhdx1985126 force-pushed the wip-crimson-client-req-pipeline-parallelism branch from 932ba74 to c309aaf Compare May 11, 2021 04:05
@xxhdx1985126 xxhdx1985126 force-pushed the wip-crimson-client-req-pipeline-parallelism branch from c309aaf to 6e0a594 Compare May 12, 2021 15:34
tchaikov
tchaikov previously approved these changes May 12, 2021
@tchaikov
Copy link
Contributor

@rzarzynski could you take a final look?

rep_op_fut_t maybe_mutated =
interruptor::make_ready_future<rep_op_fut_tuple>(
seastar::now(),
interruptor::make_interruptible(osd_op_errorator::now()));
Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

does it need to initialize maybe_mutated here? it is set value in the following.

Copy link
Contributor

@tchaikov tchaikov May 13, 2021

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

we can improve this part in a follow-up change. i think @xxhdx1985126 is just reusing the existing code structure.

please note, maybe_mutated initialized here is used if op_effects.empty().

return interruptor::do_for_each(op_effects, [] (auto& op_effect) {
return op_effect->execute();
});
return maybe_mutated.then_unpack_interruptible(
Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

why not safe_then_unpack_interruptible here?

Copy link
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

the return type of PG::submit_transation is an interruptible_future_detail<InterruptCond, seastar::future<...>> now, it is a bit wierd here.

const auto oid = m->get_snapid() == CEPH_SNAPDIR ? m->get_hobj().get_head()
: m->get_hobj();
auto ox = std::make_unique<OpsExecuter>(
auto ox = seastar::make_lw_shared<OpsExecuter>(
Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

why use lw_shared_ptr here?

Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

@xxhdx1985126 i have the same question.

Copy link
Contributor Author

@xxhdx1985126 xxhdx1985126 May 13, 2021

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

There are two places holding references to OpsExecuter, one being continuations following the all_completed future and the other the old one, now. The completion order of these two can not be determined at compile time. So I made this pointer a shared one.

Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

I noticed all places hold ox by ox.get(), so it not transfer the smart ptr, it transfer the raw ptr in ox, does it have problem by using unique_ptr?

Copy link
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

I think so. There may be chances that the unique_ptr gets destroyed before other raw ptrs, which would lead to memory access violations

return PG::do_osd_ops_ertr::future<Ref<MOSDOpReply>>(
crimson::ct_error::eagain::make());
return PG::do_osd_ops_ertr::future<rep_op_fut_t>(
crimson::ct_error::eagain::make());
Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

does it miss a seastar::now() future here? since return rep_op_fut_t it is a tuple。

Copy link
Contributor

@tchaikov tchaikov May 13, 2021

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

please note, we are not creating a future with its value here. we are returning a future carrying error. in this case, we don't pass a tuple or the argument for constructing a tuple to the constructor of errorator::future<>. otherwise, we'd use make_ready_future<>() instead.

::crimson::osd::IOInterruptCondition, T>;
using rep_op_fut_t =
std::tuple<interruptible_future<>,
interruptible_future<crimson::osd::acked_peers_t>>;
Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

there are two rep_op_fut_t definition in pg.h and pg_backend.h. can we use different name to identify them?

Copy link
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

ok, i'll deal with it

@github-actions
Copy link

This pull request can no longer be automatically merged: a rebase is needed and changes have to be manually resolved

@xxhdx1985126 xxhdx1985126 force-pushed the wip-crimson-client-req-pipeline-parallelism branch from 6e0a594 to e72600a Compare May 15, 2021 10:39
@xxhdx1985126
Copy link
Contributor Author

jenkins test make check

These two stages are used to provide more parallelism in the pipeline,
while preserving client requests order.

Signed-off-by: Xuehan Xu <xxhdx1985126@gmail.com>
Signed-off-by: Xuehan Xu <xxhdx1985126@gmail.com>
Make client requests go to the concurrent pipeline stage "wait_repop" once they
are "submitted" to the underlying objectstore, which means their on-disk order
is guaranteed, so that successive client requests can go into the "process"
pipeline stage.

Signed-off-by: Xuehan Xu <xxhdx1985126@gmail.com>
@xxhdx1985126 xxhdx1985126 force-pushed the wip-crimson-client-req-pipeline-parallelism branch from e72600a to fd1ee08 Compare May 16, 2021 15:24
@tchaikov tchaikov dismissed their stale review May 16, 2021 16:23

dismiss my approval as the last commit does not look right.

otherwise any async execution of lambdas in PG::do_osd_ops_execute() may
reference a outdated osd_op

Signed-off-by: Xuehan Xu <xxhdx1985126@gmail.com>
@xxhdx1985126 xxhdx1985126 force-pushed the wip-crimson-client-req-pipeline-parallelism branch from fd1ee08 to 0a0848d Compare May 17, 2021 13:51
@tchaikov tchaikov merged commit 0b795da into ceph:master May 18, 2021
@xxhdx1985126 xxhdx1985126 deleted the wip-crimson-client-req-pipeline-parallelism branch April 23, 2024 02:48
Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment

Projects

None yet

Development

Successfully merging this pull request may close these issues.

6 participants