Skip to content

osd/../client_request: Fix clone replicated reads#62535

Merged
Matan-B merged 5 commits intoceph:mainfrom
Matan-B:wip-matanb-crimson-replicated-missing
Apr 8, 2025
Merged

osd/../client_request: Fix clone replicated reads#62535
Matan-B merged 5 commits intoceph:mainfrom
Matan-B:wip-matanb-crimson-replicated-missing

Conversation

@Matan-B
Copy link
Contributor

@Matan-B Matan-B commented Mar 27, 2025

When handling a client request on a clone object, m->get_hobj() might
not actually represent the "actual" object from the OSD's prespective.
See: ObjectContext::resolve_oid for how this logic is handled.

If this replica has any missing clone objects that are either this object head
or any of this object's clones - drop this replicted read and bounce back to primary.

Keep the head missing check in order to avoid trying to find any
missing clones if this is not a clone.

We could possibly check if the head is not missing and then wait for
resolve_oid to lookup for the actual clone object (Which might exist!).
However, this seems to not be worth the complexity as the niche scenario
of being able to serve reads while having (possibly) releated missing objects.

Fixes (3/3): https://tracker.ceph.com/issues/70639

Contribution Guidelines

  • To sign and title your commits, please refer to Submitting Patches to Ceph.

  • If you are submitting a fix for a stable branch (e.g. "quincy"), please refer to Submitting Patches to Ceph - Backports for the proper workflow.

  • When filling out the below checklist, you may click boxes directly in the GitHub web UI. When entering or editing the entire PR message in the GitHub web UI editor, you may also select a checklist item by adding an x between the brackets: [x]. Spaces and capitalization matter when checking off items this way.

Checklist

  • Tracker (select at least one)
    • References tracker ticket
    • Very recent bug; references commit where it was introduced
    • New feature (ticket optional)
    • Doc update (no ticket needed)
    • Code cleanup (no ticket needed)
  • Component impact
    • Affects Dashboard, opened tracker ticket
    • Affects Orchestrator, opened tracker ticket
    • No impact that needs to be tracked
  • Documentation (select at least one)
    • Updates relevant documentation
    • No doc update is appropriate
  • Tests (select at least one)
Show available Jenkins commands

@Matan-B Matan-B requested a review from a team as a code owner March 27, 2025 11:00
@Matan-B Matan-B added this to Crimson Mar 27, 2025
@Matan-B Matan-B moved this to Awaits review in Crimson Mar 27, 2025
pg->get_perf_logger().inc(l_osd_replica_read_redirect_conflict);
co_await reply_op_error(pg, -EAGAIN);
co_return;
}
Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Same idea -- let's instead change can_serve_replica_read to simply check for any head or clone of the object and leave it where it was.

Copy link
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

IIUC we have 2 options here:

  1. Move can_serve_replica_read to check the resolved object.
  2. Check any head or clone of the object

For 2, we would need to iterate over each possible head/clone combination (similar to the previous version of this PR). Am I right?
IMO, checking only the resolved object seems more elegant, unless I'm missing a better way to implement the 2nd option above.


Note: Classic seems to call can_serve_replica_read for the unresolved object. I'll address this in a followup PR.

Copy link
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

One downside with moving can_serve_replica_read is that we check for writes much later in the flow, meaning we could have bounce this read earlier.

Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

I'm suggesting 2, but you can do it with one pass by simply ignoring the clone field, right?

Copy link
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Looking again, checking writes on the clone actually checks writes on the head only. This actually makes sense as clone objects can't have any independent log entries. In any log entry of a clone (which is a creation of it or deletion/update of it when trimming) there must be also a separate log entry of the head object.

I kept the can_resolve in the original place of it - see last commit for clarification notes. What do you think?

@Matan-B Matan-B force-pushed the wip-matanb-crimson-replicated-missing branch from bb65713 to 3673f06 Compare March 30, 2025 10:49
@Matan-B Matan-B requested a review from a team as a code owner March 30, 2025 10:49
@Matan-B Matan-B requested a review from athanatos March 30, 2025 10:49
@Matan-B Matan-B force-pushed the wip-matanb-crimson-replicated-missing branch 2 times, most recently from e9ce2fc to a1ae3b7 Compare March 30, 2025 10:56
@Matan-B
Copy link
Contributor Author

Matan-B commented Apr 1, 2025

jenkins test make check

@Matan-B
Copy link
Contributor Author

Matan-B commented Apr 1, 2025

jenkins test make check arm64

Matan-B added 5 commits April 2, 2025 10:28
When checking if a client requested object is missing, we must first
obtain the actual (resolved) object from the OSD prespective. See how
Crimson handles this logic with resolve_oid (Classic does the same).
For us to resolve an object we *must* obtain its object context.

There are some scenarios (e.g serving replicated reads) where checking
if the object (head or clone) is missing prior to obtaining the obc
could be useful to avoid further lookups or unnecessarily complicated/
bug-prone pathways.

With this function, we can check if any head or clone of this object
is missing, without resolving the (client) requested object.

Signed-off-by: Matan Breizman <mbreizma@redhat.com>
Signed-off-by: Matan Breizman <mbreizma@redhat.com>
When handling a client request on a clone object, m->get_hobj() might
not actually represent the "actual" object from the OSD's prespective.
See: ObjectContext::resolve_oid for how this logic is handled.

If this replica has any missing clone objects that are either this object head
or any of this object's clones - drop this replicted read and bounce back to primary.

We could possibly check if the head is not missing and then wait for
resolve_oid to lookup for the actual clone object (Which might exist!).
However, this seems to not be worth the complexity as the niche scenario
of being able to serve reads while having (possibly) releated missing objects.

Fixes (2/2): https://tracker.ceph.com/issues/70639

Signed-off-by: Matan Breizman <mbreizma@redhat.com>
For every clone object operation, we call recover_missing_snaps
to make sure that the resolved object is not missing.
For us to resolve_oid we need to lock the obc for RWREAD.
See ClientRequest::recover_missing_snaps.

Let's avoid locking the object if I don't have any possible
missing clone objects. Otherwise, try to recover as before.

This should positively affect reads even in an undegraded cluster state.

Signed-off-by: Matan Breizman <mbreizma@redhat.com>
can_serve_replica_read uses PGLog::has_write_since, when checking for
writes we actually check if any pglog entry belongs to the head object.

The only two pglog entries that are of a clone object are:
1) At creation (pg_log_entry_t::CLONE)
2) At trimming (See remove_or_update)

In both cases, the there would be another pg log entry of the head.

---

Add assertion in prepare_head_update to assert that the above is true.
The obc passed to prepare_head_update (by OpsExecuter) could also be
a clone object (after being resolved). However, write operations should
only occur to head - so let's verify that.

Signed-off-by: Matan Breizman <mbreizma@redhat.com>
@Matan-B Matan-B force-pushed the wip-matanb-crimson-replicated-missing branch from a1ae3b7 to 480b7c6 Compare April 2, 2025 10:30
@Matan-B
Copy link
Contributor Author

Matan-B commented Apr 2, 2025

@Matan-B
Copy link
Contributor Author

Matan-B commented Apr 3, 2025

jenkins test make check

@Matan-B
Copy link
Contributor Author

Matan-B commented Apr 3, 2025

jenkins test api

@Matan-B Matan-B merged commit 7150576 into ceph:main Apr 8, 2025
12 checks passed
@Matan-B Matan-B moved this from Awaits review to Merged in Crimson Apr 8, 2025
Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment

Projects

Status: Merged (Pre Tentacle Freeze)

Development

Successfully merging this pull request may close these issues.

2 participants