Skip to content

crimson/osd: Support balance reads#49116

Merged
Matan-B merged 19 commits intoceph:mainfrom
Matan-B:wip-matanb-c-balanced-reads
Mar 7, 2023
Merged

crimson/osd: Support balance reads#49116
Matan-B merged 19 commits intoceph:mainfrom
Matan-B:wip-matanb-c-balanced-reads

Conversation

@Matan-B
Copy link
Contributor

@Matan-B Matan-B commented Nov 29, 2022

Fix TestLibRBD.TestIO failure with multiple OSDs by supporting balance reads.

  • Skip do_recover_missing() on replicas.
  • Implement PG::replica_clear_repop_obc() - Reload or remove invalid objects contexts from the registry.
  • Enable thrash/workloads/small-objects-balanced/localized tests.
  • Make use of Client Request's pipeline for sequencing RepRequest.
  • on_change() to erase watchers before changing state from primary to replica.
  • Remove head member from ObjectContext

Fixes: https://tracker.ceph.com/issues/58089

TODO:

Blocked-by:

Test:

OSD=3 cluster (Test conf is set to rbd_read_from_replica_policy : balance) :

CRIMSON_COMPAT=true RBD_FEATURES= bin/ceph_test_librbd
  --gtest_filter=TestLibRBD.TestIO
seed 3608180 
Note: Google Test filter = TestLibRBD.TestIO   
[==========] Running 1 test from 1 test suite.  
[----------] Global test environment set-up. 
[----------] 1 test from TestLibRBD 
[ RUN      ] TestLibRBD.TestIO 
...

[       OK ] TestLibRBD.TestIO (66898 ms)
[----------] 1 test from TestLibRBD (66898 ms total)

[----------] Global test environment tear-down
[==========] 1 test from 1 test suite ran. (67227 ms total)
[  PASSED  ] 1 test.

Contribution Guidelines

Checklist

  • Tracker (select at least one)
    • References tracker ticket
    • Very recent bug; references commit where it was introduced
    • New feature (ticket optional)
    • Doc update (no ticket needed)
    • Code cleanup (no ticket needed)
  • Component impact
    • Affects Dashboard, opened tracker ticket
    • Affects Orchestrator, opened tracker ticket
    • No impact that needs to be tracked
  • Documentation (select at least one)
    • Updates relevant documentation
    • No doc update is appropriate
  • Tests (select at least one)
Show available Jenkins commands
  • jenkins retest this please
  • jenkins test classic perf
  • jenkins test crimson perf
  • jenkins test signed
  • jenkins test make check
  • jenkins test make check arm64
  • jenkins test submodules
  • jenkins test dashboard
  • jenkins test dashboard cephadm
  • jenkins test api
  • jenkins test docs
  • jenkins render docs
  • jenkins test ceph-volume all
  • jenkins test ceph-volume tox
  • jenkins test windows

@athanatos
Copy link
Contributor

Does ceph_test_rados have a way of sending balanced reads? We should probably enable it in the crimson tests if so.

@Matan-B Matan-B changed the title crimson/osd: Support rbd balanced reads crimson/osd: Support balanced reads Nov 30, 2022
@Matan-B Matan-B changed the title crimson/osd: Support balanced reads crimson/osd: Support balance reads Nov 30, 2022
@Matan-B Matan-B force-pushed the wip-matanb-c-balanced-reads branch 5 times, most recently from f609187 to acbd881 Compare December 6, 2022 14:25
@github-actions github-actions bot added the common label Dec 7, 2022
@Matan-B Matan-B force-pushed the wip-matanb-c-balanced-reads branch 2 times, most recently from 67fdba6 to 08891c7 Compare December 7, 2022 15:58
@Matan-B Matan-B force-pushed the wip-matanb-c-balanced-reads branch from 08891c7 to 04fc955 Compare December 11, 2022 14:05
@github-actions github-actions bot added the tests label Dec 11, 2022
@Matan-B Matan-B force-pushed the wip-matanb-c-balanced-reads branch from 04fc955 to 4abd88b Compare December 11, 2022 15:57
@Matan-B Matan-B marked this pull request as ready for review December 12, 2022 09:38
@Matan-B Matan-B requested a review from a team as a code owner December 12, 2022 09:38
@Matan-B Matan-B requested a review from athanatos December 12, 2022 09:38
@Matan-B
Copy link
Contributor Author

Matan-B commented Dec 12, 2022

jenkins test make check

@Matan-B Matan-B removed the DNM label Dec 12, 2022
!pg->is_degraded_or_backfilling_object(soid)) {
if (pg->is_unreadable_object(soid, &ver)){
logger().error("{} {} is unreadable", __func__, soid);
//TODO:: wait for unreadable
Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Why are we leaving a TODO here?

Copy link
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

The intent here was to separate the wait_for_unreadable_object() and wait_for_degraded_object() cases, same as classic does (using<hobject_t, std::list<OpRequestRef>> maps).
In crimson there isn't any differentiation between those, we simply "wait for recovery" in both cases.
In second thought, I will leave the current implementation as is and drop this commit since it's out of the scope of this PR.

pp().recover_missing
).then_interruptible([this] {
return do_recover_missing(pg, get_target_oid());
if (pg->is_primary()) {
Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

I don't think it's possible to have an internal_client_request on a replica -- confirm that, add an assert to the start of the op, and add it to the comment at the top of the class.

Copy link
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

//if (is_primary() && last != logv.rend()) {
//projected_log.skip_can_rollback_to_to_head();
//projected_log.trim(cct, last->version, nullptr, nullptr, nullptr);
//}
Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

?

Copy link
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

No projected_log concept in Crimson's pg yet. It might be easier to track down the missing pieces when grepping these comments while implementing (in similarity to BackfillState::Enqueuing::maybe_update_range()).

Copy link
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Moved into a #if 0 block

@Matan-B Matan-B force-pushed the wip-matanb-c-balanced-reads branch from 4abd88b to 9653504 Compare December 13, 2022 13:18
@Matan-B Matan-B requested a review from athanatos December 13, 2022 13:25
@Matan-B
Copy link
Contributor Author

Matan-B commented Mar 5, 2023

@rzarzynski, re-based with latest main (w/ merged SnapTrim PR).
Since head member was removed from obc class in this PR and SnapTrimObjSubEvent::remove_or_update already made use of it - the following (new) commit is a fix for this usage: b92905b

@Matan-B Matan-B requested review from a team and rzarzynski March 5, 2023 10:39
@Matan-B Matan-B force-pushed the wip-matanb-c-balanced-reads branch from 3d2bdc0 to 2104695 Compare March 5, 2023 15:07
Copy link
Contributor

@athanatos athanatos left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

LGTM other than the nits

Matan-B added 3 commits March 7, 2023 08:11
…on replica

* assert internal_client_request is on primary since
  do_recover_missing is also called by internal requests.

Signed-off-by: Matan Breizman <mbreizma@redhat.com>
Signed-off-by: Matan Breizman <mbreizma@redhat.com>
In the case of balanced read the op is not misdirected.

Signed-off-by: Matan Breizman <mbreizma@redhat.com>
@Matan-B Matan-B force-pushed the wip-matanb-c-balanced-reads branch from 2104695 to 8803508 Compare March 7, 2023 08:54
Matan-B added 8 commits March 7, 2023 08:57
* Move error checking to the beginning of do_process()

Signed-off-by: Matan Breizman <mbreizma@redhat.com>
…ed/localized

Signed-off-by: Matan Breizman <mbreizma@redhat.com>
Signed-off-by: Matan Breizman <mbreizma@redhat.com>
Signed-off-by: Matan Breizman <mbreizma@redhat.com>
Clear invalid obc from cache.

Fixes: https://tracker.ceph.com/issues/58089

Signed-off-by: Matan Breizman <mbreizma@redhat.com>
Wait on a PG to advance to the request's map epoch.

Signed-off-by: Matan Breizman <mbreizma@redhat.com>
Signed-off-by: Matan Breizman <mbreizma@redhat.com>
Signed-off-by: Matan Breizman <mbreizma@redhat.com>
@Matan-B Matan-B force-pushed the wip-matanb-c-balanced-reads branch from 8803508 to 93c719b Compare March 7, 2023 08:57
Matan-B added 7 commits March 7, 2023 09:04
Signed-off-by: Matan Breizman <mbreizma@redhat.com>
discard op in the case where same_primary_since is later than
the MOSDOp's map epoch

Signed-off-by: Matan Breizman <mbreizma@redhat.com>
Before this patch, ObjectContext had a head member which was used
to get the head obc of a clone object.
This member caused the head object to being referenced while
attempting to 'clear_replica_obc' (Since we only evict un-referenced
obc from the obc_registery).
This mechanism, of obtaining the head, is no longer needed since
'with_clone_obc' loads the head object context first (using
oid.get_head).

In this commit, head is removed from ObjectContext class and users
are removed as well.

Signed-off-by: Matan Breizman <mbreizma@redhat.com>
Signed-off-by: Matan Breizman <mbreizma@redhat.com>
Signed-off-by: Matan Breizman <mbreizma@redhat.com>
In continuation to 7ca2690:
Now that the head ref is no longer a member of obc, we need a new
substitute way to get the head when needed.

When loading a clone object, the head object is loaded
first (See with_clone_obc). Therefore we can make use of this design
to move the loaded head forward to the relevant func (See with_head_and_clone_obc).
Usually, we wouldn't need to make use of both the head and the clone obc in the
same function. However, SnapTrimObjSubEvent::remove_or_update is an abnormal usage.

Note: We want to avoid holding any unneeded references to obcs
to allow the obc_registery to evict no longer valid obc.
Therefore, with_obc() which references only a single obc is the
preferred entry point for loading obcs.

Signed-off-by: Matan Breizman <mbreizma@redhat.com>
Signed-off-by: Matan Breizman <mbreizma@redhat.com>
@Matan-B Matan-B force-pushed the wip-matanb-c-balanced-reads branch from 93c719b to b1632db Compare March 7, 2023 09:09
@Matan-B
Copy link
Contributor Author

Matan-B commented Mar 7, 2023

@Matan-B
Copy link
Contributor Author

Matan-B commented Mar 7, 2023

jenkins test api

@Matan-B Matan-B merged commit 5841654 into ceph:main Mar 7, 2023
Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment

Projects

None yet

Development

Successfully merging this pull request may close these issues.

2 participants