crimson/osd: clear peer_missing entries for deleted objects #67488
shraddhaag wants to merge 1 commit into ceph:main
This commit fixes the abort in Recovered::Recovered. There is a race between backfill and a client delete for the same object to acquire the OBC lock. When backfill acquires the lock first, the object is recovered and then deleted by the client delete request; recovering the object clears the corresponding peer_missing entry, and we transition to the Recovered state successfully. When the client delete acquires the lock first, the object is deleted. Backfill then tries to recover the object, finds it deleted, and exits early, so the stale peer_missing entry is not cleared. In Recovered::Recovered, needs_recovery() sees this stale peer_missing entry and calls abort. This issue is fixed by clearing the stale peer_missing entries for the deleted object when the object is being recovered by recover_object(). An earlier alternative fix was to clear the stale peer_missing entries in enqueue_delete_for_backfill(), called by the client delete. That approach was abandoned because it can also lead to a race producing the same error, if the backfill completes before the entries are cleared in enqueue_delete_for_backfill().

Fixes: https://tracker.ceph.com/issues/70501
Signed-off-by: Shraddha Agrawal <shraddha.agrawal000@gmail.com>
Force-pushed from 3c87d3d to abffb61
amathuria
left a comment
Looks good, approving.
Thanks for the detailed explanation @shraddhaag!
jenkins test make check arm64
Matan-B
left a comment
Good progress!
I think this approach hides the issue; instead, we should make sure racing deletes are properly handled.
We can't skip recovery every time an object is not found: there could be cases where the object is missing due to an actual error rather than a delete.
// acquired the OBC lock. Clear stale peer_missing entries that were added
// by prepare_backfill_for_missing() so that needs_recovery() does not
// fail the assertion in Recovered::Recovered.
DEBUGDPP("object does not exist, clearing peer_missing: {}", pg, soid);
We should consider the case where the object is being recovered and can't (for some catastrophic reason) be found without being deleted.
This must result in an error (abort/retry). Otherwise, we always let missing objects be marked as recovered - even on non-deleted objects.
if (pg.get_peering_state().get_peer_missing(peer).is_missing(soid, &item)) {
  DEBUGDPP("clear stale peer_missing entry {} v {} for peer {}",
           pg, soid, item.need, peer);
  pg.get_peering_state().on_peer_recover(peer, soid, item.need);
ReplicatedRecoveryBackend::recover_delete already pushes MOSDPGRecoveryDelete to all peers that still have the object marked as missing.
That path is responsible for letting each peer call on_local_recover with is_delete=true, handling the reply from the peer (via handle_recovery_delete_reply), and eventually calling on_peer_recover.
I think it's wrong to skip that logic by calling on_peer_recover here. We trick peering_state into thinking "that peer recovered this object" while this is not the case.
The issue here (prior to the changes in this PR) seems to be that we intended to "Abort the recovery in this case .." (l. 46) but we didn't actually abort it, and later on PeeringState's Recovered would fail on ceph_assert(!ps->needs_recovery()); since some peers still have the object marked as missing.
I would try to understand how we can let the MOSDPGRecoveryDelete logic execute properly and/or drop/abort recovery for this object.
Alternatively, we could try making maybe_pull_missing_obj smarter by not pulling deleted objects. Currently we only skip pulls if the object is not missing; perhaps this check could be further expanded to detect deletes.
ReplicatedRecoveryBackend::maybe_pull_missing_obj(
{
if (!local_missing.is_missing(soid)) {
// object is not missing, don't pull
return seastar::make_ready_future<>();
}
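The suggestion above could look roughly like the following. This is a hedged sketch with stand-in types (MissingItem, MissingSet, a plain deleted flag), not the actual crimson API: the real code uses hobject_t, pg_missing_t, and pg_missing_item, where a delete-type missing entry is recognizable via something like pg_missing_item::is_delete().

```cpp
#include <map>
#include <string>

// Minimal stand-in types for illustration only; the real code uses
// hobject_t and pg_missing_t / pg_missing_item from Ceph's osd_types.
struct MissingItem {
  bool deleted = false;  // models a delete-type missing entry
};

struct MissingSet {
  std::map<std::string, MissingItem> items;  // keyed by object id
};

// Sketch of an expanded maybe_pull_missing_obj() check: skip the pull
// both when the object is not missing at all and when the missing entry
// records a delete (there is nothing to pull from the primary; the
// delete-recovery path should handle it instead).
enum class PullDecision { Skip, SkipDeleted, Pull };

PullDecision maybe_pull_decision(const MissingSet& local_missing,
                                 const std::string& soid) {
  auto it = local_missing.items.find(soid);
  if (it == local_missing.items.end()) {
    return PullDecision::Skip;         // object is not missing, don't pull
  }
  if (it->second.deleted) {
    return PullDecision::SkipDeleted;  // missing because deleted: no pull
  }
  return PullDecision::Pull;           // genuinely missing data: pull it
}
```

The point of the three-way result is that a delete is distinguished from a real missing object, so deleted objects never enter the pull path while genuinely missing ones still do.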
The Issue
This issue occurs when backfill of an object and a client request to delete the same object coincide. In such a case, there is a race between backfill and the client delete to acquire the OBC lock.
Consider the scenario where backfill is in progress and has already added the peer_missing entry for the given object. One of two things can then happen:
When the lock is acquired first by backfill, the object is recovered first and then deleted by the client delete request. Recovering the object clears the corresponding peer_missing entry; the client delete then sends a complete delete op to the backfilling OSD (hence deleting the object there as well), and we are able to transition to the Recovered state successfully.
When the lock is acquired first by client delete request, the object is deleted. Then backfill tries to recover the object, finds it deleted and exits early. The stale peer_missing entry is not cleared. In Recovered::Recovered, needs_recovery() sees this stale peer_missing entry and calls abort.
The below image summarizes the issue seen in https://pulpito.ceph.com/shraddhaag-2026-02-09_14:34:53-crimson-rados-main-distro-debug-trial/41478/.
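The two orderings described above can be modeled with a toy sequence. These are hypothetical minimal types for illustration, not the Ceph API: whichever side takes the OBC lock first determines whether the peer_missing entry is ever cleared.

```cpp
#include <set>
#include <string>

// Toy model of the race: one object store and one peer_missing set.
struct ToyPG {
  std::set<std::string> store;         // objects present on the OSD
  std::set<std::string> peer_missing;  // entries added by the backfill scan

  // Backfill path: recover the object if it still exists, clearing the
  // peer_missing entry on success; if the object is gone, exit early.
  // This models the pre-fix behavior that leaves a stale entry behind.
  void backfill_recover(const std::string& soid) {
    if (store.count(soid) == 0) {
      return;  // object already deleted: stale peer_missing entry remains
    }
    peer_missing.erase(soid);  // recovered: entry cleared
  }

  // Client delete path: remove the object.
  void client_delete(const std::string& soid) { store.erase(soid); }

  // Recovered::Recovered asserts that nothing is still marked missing.
  bool needs_recovery() const { return !peer_missing.empty(); }
};
```

Running backfill_recover before client_delete leaves needs_recovery() false; running client_delete first leaves the stale entry, which is exactly what trips the assertion in Recovered::Recovered.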
The Fix
This issue is fixed by clearing out the stale peer_missing entries for the deleted object when the object is discovered to have been deleted by recover_object().
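In sketch form, the fix iterates over the backfill targets when recover_object() finds the object deleted and clears any stale missing entry via the existing on_peer_recover() bookkeeping. The types below (Peer, Version, a flattened per-peer missing map) are stand-ins for crimson's pg_shard_t, eversion_t, and PeeringState, used only to show the shape of the quoted change.

```cpp
#include <map>
#include <set>
#include <string>

using Peer = int;        // stand-in for pg_shard_t
using Version = unsigned;  // stand-in for eversion_t

struct PeeringState {
  // per-peer missing map: object id -> version needed
  std::map<Peer, std::map<std::string, Version>> peer_missing;

  bool is_missing(Peer p, const std::string& soid, Version* need) const {
    auto pit = peer_missing.find(p);
    if (pit == peer_missing.end()) return false;
    auto oit = pit->second.find(soid);
    if (oit == pit->second.end()) return false;
    *need = oit->second;
    return true;
  }

  // models PeeringState::on_peer_recover: peer p no longer misses soid
  void on_peer_recover(Peer p, const std::string& soid, Version) {
    peer_missing[p].erase(soid);
  }
};

// Called when recover_object() discovers the object was deleted under us:
// clear the stale entries left behind by prepare_backfill_for_missing().
void clear_stale_peer_missing(PeeringState& ps,
                              const std::set<Peer>& backfill_targets,
                              const std::string& soid) {
  for (Peer p : backfill_targets) {
    Version need;
    if (ps.is_missing(p, soid, &need)) {
      ps.on_peer_recover(p, soid, need);
    }
  }
}
```

Only entries for the deleted object are touched; missing entries for other objects on the same peers are left intact, so legitimate recovery work is not discarded.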
The fix was tested across many runs, all of which passed successfully.
Earlier, an alternative fix was to clear the stale peer_missing entries in enqueue_delete_for_backfill(), called by the client delete. This approach was abandoned because it can also lead to a race condition resulting in the same error: the backfill could complete before the entries are cleared in enqueue_delete_for_backfill().
Fixes: https://tracker.ceph.com/issues/70501
QA Analysis
https://tracker.ceph.com/projects/crimson/wiki/MAIN#httpspulpitocephcomshraddhaag-2026-02-26_022256-crimson-rados-main-distro-debug-trial