crimson: take read lock on replica during final push commit #62288

Merged
athanatos merged 14 commits into ceph:main from athanatos:sjust/wip-crimson-recovery-69412-replica
Apr 1, 2025

Conversation


@athanatos athanatos requested a review from a team as a code owner March 13, 2025 23:01
@athanatos (Contributor Author)

Short test run passed 10/10 tests, ready for inclusion in larger run. sjust-2025-03-13_18:30:47-crimson-rados-wip-sjust-crimson-testing-2025-03-13-1741879994-distro-default-smithi

@Matan-B Matan-B added this to Crimson Mar 16, 2025
@Matan-B Matan-B moved this to In Progress in Crimson Mar 16, 2025
@athanatos athanatos force-pushed the sjust/wip-crimson-recovery-69412-replica branch 2 times, most recently from 8252d06 to 854878d on March 17, 2025 21:03
@athanatos (Contributor Author)

jenkins test api

@athanatos (Contributor Author)

jenkins test make check arm64

@athanatos (Contributor Author) commented Mar 18, 2025

short pg log run (5 jobs) http://pulpito.ceph.com/sjust-2025-03-18_00:08:46-crimson-rados-wip-sjust-crimson-testing-2025-03-17-1742245566-distro-default-smithi/

The above run includes another branch by accident, rebuilding a new branch wip-sjust-crimson-testing-2025-03-17-1742261352

@Matan-B (Contributor) left a comment

lgtm, one last comment which I'm trying to understand better.

@athanatos athanatos force-pushed the sjust/wip-crimson-recovery-69412-replica branch from 854878d to 192671f on March 18, 2025 15:47
@Matan-B Matan-B moved this from In Progress to Needs QA in Crimson Mar 18, 2025
@athanatos (Contributor Author)

https://pulpito-ng.ceph.com/runs/sjust-2025-03-18_19:25:22-crimson-rados-wip-sjust-crimson-testing-2025-03-18-1742312874-distro-default-smithi has two failures with that fix, presumably I missed something, will fix tomorrow.

@athanatos athanatos marked this pull request as draft March 19, 2025 04:08
@athanatos athanatos force-pushed the sjust/wip-crimson-recovery-69412-replica branch from 192671f to 66f4791 on March 19, 2025 21:00
@athanatos (Contributor Author) commented Mar 19, 2025

Repushed version with fix for primary side missing ssc and simpler replica side. Arguably, this could be two different PRs now since the replica side doesn't actually need the refactor. Let me know if you'd rather I split it up.

See my explanation above, I think I'm right about ssc and coroutine param references but this would hardly be the first time for me to be confidently wrong :).

That approach didn't work, still thinking.

@athanatos (Contributor Author) commented Mar 20, 2025

Ok, the primary doesn't make a point of sending the head before clones (and I don't think we want it to), and it's also possible for a clone and the head to be pushed concurrently. We need any client replica read to block until the push is readable.

I'm creating some machinery to allow a clone obc to have a null ssc, but only if obtained through a specific interface used only in ReplicatedRecoveryBackend::handle_push (so, only on a replica). Normal accesses (including everywhere else even on the replica) will go through load_and_lock_clone, which always loads the head obc and ssc first. That pathway will gain a branch at the end to fix the ssc if it's null (will add an explanation).

It's a bit less elegant than I'd hoped, since it allows obc->ssc for clones on a replica to be null in the cache, but I think it correctly reflects the fact that we simply may not have the head's ssc at the time of the push, and the load_and_lock mechanisms should prevent other code from needing to worry about it.
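The fix-up branch described above can be sketched roughly as follows. This is a simplified, hypothetical model: `HeadContext`, the struct layouts, and the function body are stand-ins, not the real Crimson `ObjectContext`/`SnapSetContext` types or the actual `load_and_lock_clone` implementation.

```cpp
#include <cassert>
#include <memory>

// Stand-in for the snapset context normally loaded with the head.
struct SnapSetContext { bool exists = true; };

// Stand-in for a clone obc; on a replica, one created directly by
// ReplicatedRecoveryBackend::handle_push may have a null ssc.
struct ObjectContext {
  std::shared_ptr<SnapSetContext> ssc;
};

// Stand-in for the head obc, whose ssc is always loaded first.
struct HeadContext {
  std::shared_ptr<SnapSetContext> ssc = std::make_shared<SnapSetContext>();
};

// load_and_lock_clone always loads the head obc (and so its ssc)
// before the clone, then repairs the clone's ssc if a replica-side
// push left it null -- the branch added at the end of the pathway.
void load_and_lock_clone(HeadContext& head, ObjectContext& clone) {
  // ... load head obc and take locks (elided in this sketch) ...
  if (!clone.ssc) {
    // Fix-up: the push may have created this clone obc before the
    // head's snapset was available on this replica.
    clone.ssc = head.ssc;
  }
  assert(clone.ssc);
}
```

With this shape, every normal accessor goes through `load_and_lock_clone` and never observes a null ssc, even though the cache may briefly hold one.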

@athanatos (Contributor Author) commented Mar 20, 2025

Testing a version of the above. This is probably not worth looking at again until I get a good test run. I'll unset draft when I get to that point.

@athanatos athanatos force-pushed the sjust/wip-crimson-recovery-69412-replica branch 2 times, most recently from 50e20e7 to c0e154a on March 20, 2025 02:38
Calling set_state_obc twice on the same obc results in
ObjectContext::list_link_cnt being stuck above 0.  ~ObjectContext() then
crashes when destructing the obc_accessing_hook.  Add an assert, as the
above can be quite time-consuming to track down.

Signed-off-by: Samuel Just <sjust@redhat.com>
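The guard described in the commit message above might look roughly like this. Everything here is a simplified stand-in (the real `ObjectContext` uses an intrusive hook, not a plain counter); the point is only the entry assert, which turns an eventual destructor crash into an immediate, localized failure.

```cpp
#include <cassert>

// Stand-in for the real ObjectContext with its obc_accessing_hook.
struct ObjectContext {
  int list_link_cnt = 0;   // >0 means the obc is linked into a list

  void link()   { ++list_link_cnt; }
  void unlink() { --list_link_cnt; }

  ~ObjectContext() {
    // In the real code, destructing with a live hook crashes;
    // modeled here as a destructor assert.
    assert(list_link_cnt == 0);
  }
};

// Sketch of set_state_obc with the added guard: a second call on the
// same obc trips the assert here, which is far cheaper to diagnose
// than the eventual ~ObjectContext() crash.
void set_state_obc(ObjectContext& obc) {
  assert(obc.list_link_cnt == 0);
  obc.link();
}
```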
Signed-off-by: Samuel Just <sjust@redhat.com>
…tine

Signed-off-by: Samuel Just <sjust@redhat.com>
…utine

Signed-off-by: Samuel Just <sjust@redhat.com>
More readable as one function now that it uses co_await.

Signed-off-by: Samuel Just <sjust@redhat.com>
…n handle_push if complete

Signed-off-by: Samuel Just <sjust@redhat.com>
pull_info.obc->ssc->exists

must be true because we just set it, and

obc->ssc->snapset.seq == pull_info.obc->ssc->snapset.seq

must be true because we just set pull_info.obc to obc.

Signed-off-by: Samuel Just <sjust@redhat.com>
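A tiny illustration of why the commit message above calls those conditions redundant: after `pull_info.obc` is assigned from `obc`, both checks are tautologies. All types below are simplified stand-ins for the Crimson structures.

```cpp
#include <cassert>
#include <memory>

// Simplified stand-ins for the snapset and obc types.
struct SnapSet { unsigned seq = 0; };
struct SnapSetContext { bool exists = false; SnapSet snapset; };
struct ObjectContext { std::shared_ptr<SnapSetContext> ssc; };
struct PullInfo { std::shared_ptr<ObjectContext> obc; };

// After this assignment, pull_info.obc and obc are the same object,
// so comparing their snapset.seq (or re-checking exists, which was
// just set) cannot fail.
PullInfo make_pull_info(std::shared_ptr<ObjectContext> obc) {
  PullInfo pi;
  pi.obc = std::move(obc);
  return pi;
}
```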
… to have target_state preloaded

Certain recovery users will need to be able to directly create and lock
ObjectContext instances -- this allows the Manager machinery to work
correctly in that case.

Signed-off-by: Samuel Just <sjust@redhat.com>
@athanatos athanatos force-pushed the sjust/wip-crimson-recovery-69412-replica branch from c0e154a to f2f2e79 on March 22, 2025 01:40
@athanatos (Contributor Author) commented Mar 22, 2025

@athanatos athanatos marked this pull request as ready for review March 22, 2025 01:41
@Matan-B (Contributor) left a comment

Overall lgtm, I'm glad we were able to resolve these obscure nuances early on!
I've left a few suggestions, and the only critical question here is regarding the head obc's existence on the replica side.

…etadata

Recovery will use this pathway to directly insert metadata from the
attributes in the PushOp.  Subsequent accessors of those obcs in
ObjectContextLoader::load_and_lock* should simply attempt to lock
the obc rather than trying to load them from disk.

Signed-off-by: Samuel Just <sjust@redhat.com>
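The shape of this pathway can be sketched as below. This is a hedged, hypothetical model: the map-based cache, the counter, and the function names `insert_from_push`/`load_and_lock` are illustrative stand-ins, not the real `ObjectContextLoader` API.

```cpp
#include <cassert>
#include <map>
#include <memory>
#include <string>

// Stand-in obc: tracks whether metadata is already populated.
struct ObjectContext {
  bool fully_loaded = false;
  bool locked = false;
};

struct Loader {
  std::map<std::string, std::shared_ptr<ObjectContext>> cache;
  int disk_loads = 0;  // counts simulated reads of ondisk state

  // Recovery path: build the obc directly from the attributes carried
  // in the PushOp and insert it into the cache, fully populated.
  std::shared_ptr<ObjectContext> insert_from_push(const std::string& oid) {
    auto obc = std::make_shared<ObjectContext>();
    obc->fully_loaded = true;
    cache[oid] = obc;
    return obc;
  }

  // Normal path: subsequent load_and_lock* accessors find the cached
  // obc and simply take the lock, loading from disk only if metadata
  // is actually missing.
  std::shared_ptr<ObjectContext> load_and_lock(const std::string& oid) {
    auto& obc = cache[oid];
    if (!obc) obc = std::make_shared<ObjectContext>();
    if (!obc->fully_loaded) {
      ++disk_loads;             // simulated disk read
      obc->fully_loaded = true;
    }
    obc->locked = true;
    return obc;
  }
};
```

The design benefit mirrored here is that recovery pays no redundant read: the obc it constructed from the push is exactly what later accessors lock.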
…ush_data

When recovering an object, we know it cannot already be cached and we
aren't interested in reading the current ondisk state.  Instead, just
create the correct obc from scratch.

Signed-off-by: Samuel Just <sjust@redhat.com>
…p helper

Signed-off-by: Samuel Just <sjust@redhat.com>
…ush_data in _handle_pull_response

Simply construct obc directly from push data.

Signed-off-by: Samuel Just <sjust@redhat.com>
We need a place to hold the obc for in-progress, multi-message pushes
on the replica side in order to lock the obc during the final commit.

This could be introduced as a ReplicatedRecoveryBackend concept, but
that seems unnecessarily complicated.

Signed-off-by: Samuel Just <sjust@redhat.com>
…replica push commit

We need to block reads on any recovery object between when we mark the
object not missing and when the commit completes.  Create obc from
first push_op and use that obc during the final commit to create
an ObjectContextLoader::Manager to lock.

Signed-off-by: Samuel Just <sjust@redhat.com>
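The sequencing this commit message describes can be modeled as follows. This is a deliberately simplified sketch: `std::shared_mutex` stands in for the obc lock, and the exclusive mode on the commit side is chosen only to make the blocking window obvious; the actual PR takes the lock through the `ObjectContextLoader::Manager` machinery, whose lock modes differ from this model.

```cpp
#include <mutex>
#include <shared_mutex>

// Stand-in for a replica-side object being recovered.
struct RecoveredObject {
  std::shared_mutex obc_lock;  // stand-in for the obc's rw lock
  bool missing = true;
  bool committed = false;
};

// Final push commit: hold the lock across marking the object not
// missing and completing the commit, so no concurrent read can
// observe "not missing, but commit not yet done".
void final_push_commit(RecoveredObject& obj) {
  std::unique_lock<std::shared_mutex> guard(obj.obc_lock);
  obj.missing = false;
  // ... submit transaction and wait for the commit (elided) ...
  obj.committed = true;
}

// Client replica read: blocks behind an in-flight final commit; once
// it acquires the lock, the object is either still missing or fully
// committed -- never in between.
bool replica_read(RecoveredObject& obj) {
  std::shared_lock<std::shared_mutex> guard(obj.obc_lock);
  return !obj.missing && obj.committed;
}
```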
@athanatos athanatos force-pushed the sjust/wip-crimson-recovery-69412-replica branch from f2f2e79 to 482036c on March 24, 2025 20:36
@Matan-B (Contributor) commented Mar 25, 2025

jenkins test make check arm64


@athanatos athanatos merged commit 5d01518 into ceph:main Apr 1, 2025
12 checks passed
@Matan-B (Contributor) commented Apr 2, 2025

merging based on https://pulpito.ceph.com/sjust-2025-04-01_07:07:33-crimson-rados-wip-sjust-crimson-testing-2025-03-31-1743478092-distro-default-gibba/

@athanatos, I could not identify the failed job here; it's possibly not related. Looks like 1 PG can't get out of the backfilling state, so I opened a tracker: https://tracker.ceph.com/issues/70765

@Matan-B Matan-B moved this from Needs QA to Merged in Crimson Apr 2, 2025

Projects

Status: Merged (Pre Tentacle Freeze)
