crimson: take read lock on replica during final push commit #62288
Short test run passed 10/10 tests, ready for inclusion in larger run. sjust-2025-03-13_18:30:47-crimson-rados-wip-sjust-crimson-testing-2025-03-13-1741879994-distro-default-smithi
Force-pushed from 8252d06 to 854878d
jenkins test api
jenkins test make check arm64
The above run includes another branch by accident, rebuilding a new branch wip-sjust-crimson-testing-2025-03-17-1742261352
Matan-B
left a comment
lgtm, one last comment which I'm trying to understand better.
Force-pushed from 854878d to 192671f
https://pulpito-ng.ceph.com/runs/sjust-2025-03-18_19:25:22-crimson-rados-wip-sjust-crimson-testing-2025-03-18-1742312874-distro-default-smithi has two failures with that fix, presumably I missed something, will fix tomorrow.
Force-pushed from 192671f to 66f4791
That approach didn't work, still thinking.
Ok, the primary doesn't make a point of sending the head before clones (and I don't think we want it to), and it's also possible for a clone and the head to be pushed concurrently. We need any client replica read to block until the push is readable.

I'm creating some machinery to allow a clone obc to have a null ssc, but only if obtained through a specific interface used only in ReplicatedRecoveryBackend::handle_push (so, only on a replica). Normal accesses (including everywhere else even on the replica) will go through load_and_lock_clone, which always loads the head obc and ssc first. That pathway will gain a branch at the end to fix the ssc if it's null (will add an explanation).

It's a bit less elegant than I'd hoped, since it allows obc->ssc for clones on a replica to be null in the cache, but I think it correctly reflects the fact that we simply may not have the head's ssc at the time of the push, and the load_and_lock mechanisms should prevent other code from needing to worry about it.
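The fix-up branch described above can be sketched roughly as follows. This is a simplified, hypothetical model — `SnapSetContext`, `ObjectContext`, `ObcCache`, and `load_and_lock_clone` here are stand-ins, not the actual crimson classes — showing only the invariant: a clone obc created from a push may have a null ssc, and the normal access path loads the head first and patches the clone before anyone else can observe the null.

```cpp
#include <cassert>
#include <map>
#include <memory>
#include <string>

// Stand-in types (names illustrative, not the real crimson API).
struct SnapSetContext {
  unsigned seq = 0;
};

struct ObjectContext {
  std::string oid;
  bool is_head = false;
  std::shared_ptr<SnapSetContext> ssc;  // may be null for a replica-side clone
};

using ObcCache = std::map<std::string, std::shared_ptr<ObjectContext>>;

// Normal access path: always obtain the head obc (and its ssc) first,
// then fix a clone obc whose ssc is still null from a replica-side push,
// so the rest of the code never sees a null ssc.
std::shared_ptr<ObjectContext> load_and_lock_clone(
    ObcCache& cache,
    const std::string& head_oid,
    const std::string& clone_oid) {
  auto head = cache.at(head_oid);   // head loaded before the clone
  auto clone = cache.at(clone_oid);
  if (!clone->ssc) {
    // The push created this obc before the head arrived; repair it now.
    clone->ssc = head->ssc;
  }
  return clone;
}
```

The key point the sketch captures is that only this pathway repairs the null ssc, so the special null-ssc state never leaks past the loader.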
Testing a version of the above. This is probably not worth looking at again until I get a good test run. I'll unset draft when I get to that point.
Force-pushed from 50e20e7 to c0e154a
Calling set_state_obc twice on the same obc results in ObjectContext::list_link_cnt being stuck above 0. ~ObjectContext() then crashes when destructing the obc_accessing_hook. Add an assert, as the above can be quite time-consuming to track down. Signed-off-by: Samuel Just <sjust@redhat.com>
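The failure mode in this commit message can be illustrated with a minimal mock (this is not the real ObjectContext — the struct and its fields here are invented for demonstration): a link count that is incremented on each `set_state_obc` call never returns to 0 if the call happens twice, and the destructor check then fires far from the actual bug. Asserting at the second call instead reports the error at its source.

```cpp
#include <cassert>

// Illustrative mock: track how many times the obc has been linked into
// the accessing list, and assert that set_state_obc is not invoked twice
// on the same instance, which would leave list_link_cnt stuck above 0
// and crash only much later, in the destructor.
struct MockObjectContext {
  int list_link_cnt = 0;
  bool state_set = false;

  void set_state_obc() {
    assert(!state_set);  // catch the double call at its source...
    state_set = true;
    ++list_link_cnt;
  }

  void unlink() {
    --list_link_cnt;
  }

  ~MockObjectContext() {
    // ...rather than first noticing here, long after the bad call.
    assert(list_link_cnt == 0);
  }
};
```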
Signed-off-by: Samuel Just <sjust@redhat.com>
…tine Signed-off-by: Samuel Just <sjust@redhat.com>
…utine Signed-off-by: Samuel Just <sjust@redhat.com>
More readable as one function now that it uses co_await. Signed-off-by: Samuel Just <sjust@redhat.com>
…n handle_push if complete Signed-off-by: Samuel Just <sjust@redhat.com>
pull_info.obc->ssc->exists must be true because we just set it, and obc->ssc->snapset.seq == pull_info.obc->ssc->snapset.seq must be true because we just set pull_info.obc to obc. Signed-off-by: Samuel Just <sjust@redhat.com>
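The tautology this commit removes can be shown in a few lines of stand-in code (the types below are simplified placeholders, not the actual crimson structures): once `pull_info.obc` is assigned the very pointer whose ssc was just set, both former checks hold by construction — the two sides of the comparison are literally the same object.

```cpp
#include <cassert>
#include <memory>

// Simplified stand-ins for the real types.
struct SnapSetContext {
  bool exists = false;
  unsigned seq = 0;
};

struct ObjectContext {
  std::shared_ptr<SnapSetContext> ssc;
};

struct PullInfo {
  std::shared_ptr<ObjectContext> obc;
};

// Models the end of the pull path: set exists, then alias the obc.
void finish_pull(PullInfo& pull_info, std::shared_ptr<ObjectContext> obc) {
  obc->ssc->exists = true;  // "we just set it"
  pull_info.obc = obc;      // "we just set pull_info.obc to obc"
  // Both removed checks are now trivially true:
  assert(pull_info.obc->ssc->exists);
  assert(pull_info.obc->ssc->seq == obc->ssc->seq);  // same object
}
```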
… to have target_state preloaded Certain recovery users will need to be able to directly create and lock ObjectContext instances -- this allows the Manager machinery to work correctly in that case. Signed-off-by: Samuel Just <sjust@redhat.com>
Force-pushed from c0e154a to f2f2e79
5/5 tests passed https://pulpito.ceph.com/sjust-2025-03-21_04:30:23-crimson-rados-wip-sjust-crimson-testing-2025-03-20-1742518347-distro-default-smithi/ Marking ready for review and larger test.
Matan-B
left a comment
Overall lgtm, I'm glad we were able to resolve these obscure nuances early on!
I've left a few suggestions; the only critical question here is regarding the head obc's existence on the replica side.
…etadata Recovery will use this pathway to directly insert metadata from the attributes in the PushOp. Subsequent accessors of those obcs in ObjectContextLoader::load_and_lock* should simply attempt to lock the obc rather than trying to load them from disk. Signed-off-by: Samuel Just <sjust@redhat.com>
…ush_data When recovering an object, we know it cannot already be cached and we aren't interested in reading the current ondisk state. Instead, just create the correct obc from scratch. Signed-off-by: Samuel Just <sjust@redhat.com>
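The two commit messages above describe a cache-insertion path for recovery; a rough model of the idea (with invented names — `create_from_push` and `load_and_lock` here are stand-ins for the real loader interface) is: recovery constructs the obc directly from PushOp attributes and places it in the cache, so a later loader call finds it cached and only takes the lock instead of reading metadata from disk.

```cpp
#include <cassert>
#include <map>
#include <memory>
#include <string>

// Stand-in types, not the real crimson loader.
struct ObjectContext {
  bool loaded_from_disk = false;
  bool locked = false;
};

struct ObjectContextLoader {
  std::map<std::string, std::shared_ptr<ObjectContext>> cache;

  // Recovery path: construct from push data, never touch disk.
  std::shared_ptr<ObjectContext> create_from_push(const std::string& oid) {
    auto obc = std::make_shared<ObjectContext>();
    return cache[oid] = obc;
  }

  // Normal path: read from disk only on a cache miss.
  std::shared_ptr<ObjectContext> load_and_lock(const std::string& oid) {
    std::shared_ptr<ObjectContext> obc;
    auto it = cache.find(oid);
    if (it != cache.end()) {
      obc = it->second;              // cached: just lock it
    } else {
      obc = std::make_shared<ObjectContext>();
      obc->loaded_from_disk = true;  // stand-in for the metadata load
      cache[oid] = obc;
    }
    obc->locked = true;
    return obc;
  }
};
```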
…p helper Signed-off-by: Samuel Just <sjust@redhat.com>
…ush_data in _handle_pull_response Simply construct obc directly from push data. Signed-off-by: Samuel Just <sjust@redhat.com>
We need a place to hold the obc for in-progress, multi message pushes on the replica side in order to lock the obc during the final commit. This could be introduced as a ReplicatedRecoveryBackend concept, but that seems unnecessarily complicated. Signed-off-by: Samuel Just <sjust@redhat.com>
…replica push commit We need to block reads on any recovery object between when we mark the object not missing and when the commit completes. Create obc from first push_op and use that obc during the final commit to create an ObjectContextLoader::Manager to lock. Signed-off-by: Samuel Just <sjust@redhat.com>
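The replica-side bookkeeping the last two commit messages describe can be sketched as follows. This is a hypothetical simplification — `RecoveryBackend`, `handle_first_push`, and `handle_final_push` are invented names — showing only the shape: the obc created from the first PushOp of a multi-message push is held in a map so the final commit can retrieve it and take the lock that blocks client reads until the commit completes.

```cpp
#include <cassert>
#include <map>
#include <memory>
#include <string>

// Stand-in obc with a trivial lock flag.
struct ObjectContext {
  bool read_locked = false;
  void lock_for_read() { read_locked = true; }
  void unlock() { read_locked = false; }
};

struct RecoveryBackend {
  // hoid -> obc for pushes still in flight on this replica.
  std::map<std::string, std::shared_ptr<ObjectContext>> in_progress_push_obcs;

  void handle_first_push(const std::string& hoid) {
    // Create the obc directly from push data; no on-disk load.
    in_progress_push_obcs[hoid] = std::make_shared<ObjectContext>();
  }

  std::shared_ptr<ObjectContext> handle_final_push(const std::string& hoid) {
    auto obc = in_progress_push_obcs.at(hoid);
    // Reads on this object block between "not missing" and commit done.
    obc->lock_for_read();
    in_progress_push_obcs.erase(hoid);
    return obc;
  }
};
```

Keeping the map on the backend rather than introducing a new ReplicatedRecoveryBackend concept matches the comment above that a dedicated abstraction would be unnecessarily complicated.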
Force-pushed from f2f2e79 to 482036c
jenkins test make check arm64

@athanatos, I could not identify the failed job here, possibly not related. Looks like 1 pg can't get out of backfilling state; I opened a tracker: https://tracker.ceph.com/issues/70765