osd/../client_request: Fix clone replicated reads by Matan-B · Pull Request #62535 · ceph/ceph

Matan-B · 2025-03-27T11:00:27Z

When handling a client request on a clone object, m->get_hobj() might
not actually represent the "actual" object from the OSD's prespective.
See: ObjectContext::resolve_oid for how this logic is handled.

If this replica has any missing clone objects that are either this object head
or any of this object's clones - drop this replicted read and bounce back to primary.

Keep the head missing check in order to avoid trying to find any
missing clones if this is not a clone.

We could possibly check if the head is not missing and then wait for
resolve_oid to lookup for the actual clone object (Which might exist!).
However, this seems to not be worth the complexity as the niche scenario
of being able to serve reads while having (possibly) releated missing objects.

Fixes (3/3): https://tracker.ceph.com/issues/70639

Contribution Guidelines

To sign and title your commits, please refer to Submitting Patches to Ceph.
If you are submitting a fix for a stable branch (e.g. "quincy"), please refer to Submitting Patches to Ceph - Backports for the proper workflow.
When filling out the below checklist, you may click boxes directly in the GitHub web UI. When entering or editing the entire PR message in the GitHub web UI editor, you may also select a checklist item by adding an x between the brackets: [x]. Spaces and capitalization matter when checking off items this way.

Checklist

Tracker (select at least one)
- References tracker ticket
- Very recent bug; references commit where it was introduced
- New feature (ticket optional)
- Doc update (no ticket needed)
- Code cleanup (no ticket needed)
Component impact
- Affects Dashboard, opened tracker ticket
- Affects Orchestrator, opened tracker ticket
- No impact that needs to be tracked
Documentation (select at least one)
- Updates relevant documentation
- No doc update is appropriate
Tests (select at least one)
- Includes unit test(s)
- Includes integration test(s)
- Includes bug reproducer
- No tests

Show available Jenkins commands

jenkins test classic perf Jenkins Job | Jenkins Job Definition
jenkins test crimson perf Jenkins Job | Jenkins Job Definition
jenkins test signed Jenkins Job | Jenkins Job Definition
jenkins test make check Jenkins Job | Jenkins Job Definition
jenkins test make check arm64 Jenkins Job | Jenkins Job Definition
jenkins test submodules Jenkins Job | Jenkins Job Definition
jenkins test dashboard Jenkins Job | Jenkins Job Definition
jenkins test dashboard cephadm Jenkins Job | Jenkins Job Definition
jenkins test api Jenkins Job | Jenkins Job Definition
jenkins test docs ReadTheDocs | Github Workflow Definition
jenkins test ceph-volume all Jenkins Jobs | Jenkins Jobs Definition
jenkins test windows Jenkins Job | Jenkins Job Definition
jenkins test rook e2e Jenkins Job | Jenkins Job Definition

src/crimson/osd/pg.h

src/crimson/osd/osd_operations/client_request.cc

src/crimson/osd/pg.cc

athanatos · 2025-03-27T17:34:13Z

src/crimson/osd/osd_operations/client_request.cc

+      pg->get_perf_logger().inc(l_osd_replica_read_redirect_conflict);
+      co_await reply_op_error(pg, -EAGAIN);
+      co_return;
+  }


Same idea -- let's instead change can_serve_replica_read to simply check for any head or clone of the object and leave it where it was.

IIUC we have 2 options here:

Move can_serve_replica_read to check the resolved object.

Check any head or clone of the object

For 2, we would need to iterate over each possible head/clone combination (similar to the previous version of this PR). Am I right?
IMO, checking only the resolved object seems more elegant, unless I'm missing a better way to implement the 2nd option above.

Note: Classic seems to call can_serve_replica_read for the unresolved object. I'll address this in a followup PR.

One downside with moving can_serve_replica_read is that we check for writes much later in the flow, meaning we could have bounce this read earlier.

I'm suggesting 2, but you can do it with one pass by simply ignoring the clone field, right?

Looking again, checking writes on the clone actually checks writes on the head only. This actually makes sense as clone objects can't have any independent log entries. In any log entry of a clone (which is a creation of it or deletion/update of it when trimming) there must be also a separate log entry of the head object.

I kept the can_resolve in the original place of it - see last commit for clarification notes. What do you think?

src/crimson/osd/osd_operations/client_request.cc

Matan-B · 2025-04-01T12:24:53Z

jenkins test make check

Matan-B · 2025-04-01T12:25:03Z

jenkins test make check arm64

When checking if a client requested object is missing, we must first obtain the actual (resolved) object from the OSD prespective. See how Crimson handles this logic with resolve_oid (Classic does the same). For us to resolve an object we *must* obtain its object context. There are some scenarios (e.g serving replicated reads) where checking if the object (head or clone) is missing prior to obtaining the obc could be useful to avoid further lookups or unnecessarily complicated/ bug-prone pathways. With this function, we can check if any head or clone of this object is missing, without resolving the (client) requested object. Signed-off-by: Matan Breizman <mbreizma@redhat.com>

Signed-off-by: Matan Breizman <mbreizma@redhat.com>

When handling a client request on a clone object, m->get_hobj() might not actually represent the "actual" object from the OSD's prespective. See: ObjectContext::resolve_oid for how this logic is handled. If this replica has any missing clone objects that are either this object head or any of this object's clones - drop this replicted read and bounce back to primary. We could possibly check if the head is not missing and then wait for resolve_oid to lookup for the actual clone object (Which might exist!). However, this seems to not be worth the complexity as the niche scenario of being able to serve reads while having (possibly) releated missing objects. Fixes (2/2): https://tracker.ceph.com/issues/70639 Signed-off-by: Matan Breizman <mbreizma@redhat.com>

For every clone object operation, we call recover_missing_snaps to make sure that the resolved object is not missing. For us to resolve_oid we need to lock the obc for RWREAD. See ClientRequest::recover_missing_snaps. Let's avoid locking the object if I don't have any possible missing clone objects. Otherwise, try to recover as before. This should positively affect reads even in an undegraded cluster state. Signed-off-by: Matan Breizman <mbreizma@redhat.com>

can_serve_replica_read uses PGLog::has_write_since, when checking for writes we actually check if any pglog entry belongs to the head object. The only two pglog entries that are of a clone object are: 1) At creation (pg_log_entry_t::CLONE) 2) At trimming (See remove_or_update) In both cases, the there would be another pg log entry of the head. --- Add assertion in prepare_head_update to assert that the above is true. The obc passed to prepare_head_update (by OpsExecuter) could also be a clone object (after being resolved). However, write operations should only occur to head - so let's verify that. Signed-off-by: Matan Breizman <mbreizma@redhat.com>

Matan-B · 2025-04-02T16:59:32Z

https://pulpito.ceph.com/matan-2025-04-02_14:31:37-crimson-rados-wip-matanb-crimson-replicated-missing-distro-crimson-smithi/

Matan-B · 2025-04-03T09:35:34Z

jenkins test make check

Matan-B · 2025-04-03T09:35:44Z

jenkins test api

Matan-B requested review from athanatos, jdurgin and xxhdx1985126 March 27, 2025 11:00

Matan-B requested a review from a team as a code owner March 27, 2025 11:00

Matan-B added this to Crimson Mar 27, 2025

Matan-B moved this to Awaits review in Crimson Mar 27, 2025

github-actions bot added the crimson label Mar 27, 2025

Matan-B commented Mar 27, 2025

View reviewed changes

src/crimson/osd/pg.h Show resolved Hide resolved

Matan-B commented Mar 27, 2025

View reviewed changes

src/crimson/osd/osd_operations/client_request.cc Show resolved Hide resolved

athanatos reviewed Mar 27, 2025

View reviewed changes

src/crimson/osd/pg.cc Show resolved Hide resolved

athanatos reviewed Mar 27, 2025

View reviewed changes

athanatos reviewed Mar 28, 2025

View reviewed changes

src/crimson/osd/osd_operations/client_request.cc Outdated Show resolved Hide resolved

Matan-B force-pushed the wip-matanb-crimson-replicated-missing branch from bb65713 to 3673f06 Compare March 30, 2025 10:49

Matan-B requested a review from a team as a code owner March 30, 2025 10:49

Matan-B requested a review from athanatos March 30, 2025 10:49

github-actions bot added core tests labels Mar 30, 2025

Matan-B force-pushed the wip-matanb-crimson-replicated-missing branch 2 times, most recently from e9ce2fc to a1ae3b7 Compare March 30, 2025 10:56

Matan-B added 5 commits April 2, 2025 10:28

test/types: add is_missing_any_head_or_clone_of testcase

46d0e42

Signed-off-by: Matan Breizman <mbreizma@redhat.com>

Matan-B force-pushed the wip-matanb-crimson-replicated-missing branch from a1ae3b7 to 480b7c6 Compare April 2, 2025 10:30

Matan-B added ready-to-merge wip-matan-testing labels Apr 7, 2025

athanatos approved these changes Apr 7, 2025

View reviewed changes

Matan-B merged commit 7150576 into ceph:main Apr 8, 2025
12 checks passed

Matan-B moved this from Awaits review to Merged in Crimson Apr 8, 2025

Provide feedback

Saved searches

Use saved searches to filter your results more quickly

osd/../client_request: Fix clone replicated reads#62535

osd/../client_request: Fix clone replicated reads#62535
Matan-B merged 5 commits intoceph:mainfrom
Matan-B:wip-matanb-crimson-replicated-missing

Matan-B commented Mar 27, 2025 •

edited

Loading

Uh oh!

Uh oh!

Uh oh!

Uh oh!

athanatos Mar 27, 2025

Uh oh!

Matan-B Mar 30, 2025

Uh oh!

Matan-B Mar 31, 2025

Uh oh!

athanatos Apr 1, 2025

Uh oh!

Matan-B Apr 2, 2025

Uh oh!

Uh oh!

Matan-B commented Apr 1, 2025

Uh oh!

Matan-B commented Apr 1, 2025

Uh oh!

Matan-B commented Apr 2, 2025

Uh oh!

Matan-B commented Apr 3, 2025

Uh oh!

Matan-B commented Apr 3, 2025

Uh oh!

Uh oh!

Reviewers

Assignees

Labels

Projects

Milestone

Development

Uh oh!

2 participants

Conversation

Matan-B commented Mar 27, 2025 • edited Loading Uh oh! There was an error while loading. Please reload this page.

Uh oh!

Contribution Guidelines

Checklist

Uh oh!

Uh oh!

Uh oh!

Uh oh!

athanatos Mar 27, 2025

Choose a reason for hiding this comment

Uh oh!

Matan-B Mar 30, 2025

Choose a reason for hiding this comment

Uh oh!

Matan-B Mar 31, 2025

Choose a reason for hiding this comment

Uh oh!

athanatos Apr 1, 2025

Choose a reason for hiding this comment

Uh oh!

Matan-B Apr 2, 2025

Choose a reason for hiding this comment

Uh oh!

Uh oh!

Matan-B commented Apr 1, 2025

Uh oh!

Matan-B commented Apr 1, 2025

Uh oh!

Matan-B commented Apr 2, 2025

Uh oh!

Matan-B commented Apr 3, 2025

Uh oh!

Matan-B commented Apr 3, 2025

Uh oh!

Uh oh!

Reviewers

Assignees

Labels

Projects

Milestone

Development

Uh oh!

2 participants

Matan-B commented Mar 27, 2025 •

edited

Loading