osd: Add force-reremove-snap mon command#53235

Closed
Matan-B wants to merge 2 commits into ceph:main from Matan-B:wip-matanb-reremove-snap-only

@Matan-B Matan-B commented Aug 31, 2023

This PR is split out of #52971 because it is an independent change that may be useful in other repair cases.
Splitting it out also allows better test coverage for each part of the original fix before merging.


Forces re-removal of already removed snapshots in the range
[lower_snapid_bound, upper_snapid_bound) on pool <pool>
in order to cause OSDs to re-trim them after fixing malformed
snap_mapper entries.
See bug #62596 for details.
See also OSD admin socket command fix_malformed_snapmapper_keys.
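
To illustrate the half-open range in the description above, here is a minimal sketch in Python. It is a toy model, not Ceph code: the helper `snaps_to_reremove` and the sample snap IDs are hypothetical, and the real command's validation and storage lookups are not shown. It only demonstrates that `lower_snapid_bound` is included and `upper_snapid_bound` is excluded.

```python
def snaps_to_reremove(purged_snaps, lower_snapid_bound, upper_snapid_bound):
    """Return the already-purged snap IDs inside [lower, upper).

    Hypothetical helper for illustration only; not part of Ceph.
    """
    return sorted(
        snapid
        for snapid in purged_snaps
        if lower_snapid_bound <= snapid < upper_snapid_bound
    )


# Made-up snap IDs that were already removed on some pool.
purged = {2, 3, 5, 8, 9}

# The lower bound (3) is included, the upper bound (9) is excluded.
selected = snaps_to_reremove(purged, 3, 9)
print(selected)
```

Under this model, an empty or inverted range selects nothing; how the actual command treats such input is not stated in this PR.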

Fixes: Part 2/3 - https://tracker.ceph.com/issues/62596

@Matan-B Matan-B requested a review from a team as a code owner August 31, 2023 08:37
@Matan-B Matan-B force-pushed the wip-matanb-reremove-snap-only branch from 5d8ac17 to 15e83e2 Compare August 31, 2023 09:59
@github-actions github-actions bot added the tests label Aug 31, 2023
@Matan-B Matan-B force-pushed the wip-matanb-reremove-snap-only branch 7 times, most recently from 920cb11 to e3b90da Compare September 3, 2023 11:10
```
Forces re-removal of already removed snapshots in the range
[lower_snapid_bound, upper_snapid_bound) on pool <pool>
in order to cause OSDs to re-trim them after fixing malformed
snap_mapper entries.
See bug ceph#62596 for details.
See also OSD admin socket command fix_malformed_snapmapper_keys.
```

Fixes: Part 2/3 - https://tracker.ceph.com/issues/62596

Signed-off-by: Matan Breizman <mbreizma@redhat.com>
@Matan-B Matan-B force-pushed the wip-matanb-reremove-snap-only branch 2 times, most recently from 04e7427 to c0e9403 Compare September 3, 2023 13:48
Signed-off-by: Matan Breizman <mbreizma@redhat.com>
@Matan-B Matan-B force-pushed the wip-matanb-reremove-snap-only branch from c0e9403 to 2ad5a5e Compare September 3, 2023 16:02

Matan-B commented Sep 4, 2023

@athanatos, I'm working on an alternative version that reuses snap_trimq_repeat, which was introduced in 6192fb6 and is used by scrub.

Reusing the pg->queue_snap_retrim() mechanism looks safer, and it may fit well here too.
See: main...Matan-B:ceph:wip-matanb-reremove-snap-only-2

The main difference is that, to find the purged snaps, the second version will use the OSD's PURGED_SNAP_PREFIX rather than the mon's purged_snap_ keys. The actual trimming will be triggered through queue_snap_retrim/snap_trimq_repeat. See how PrimaryLogPG::AwaitAsyncWork::react handles snaps found in snap_trimq_repeat.

When we discover mapped objects that are covered by ranges of snaps that
should be purged, we requeue the snapid for trim on the relevant PG(s).
For these 'repeat' trims we skip the final step(s) to mark the snapid as
purged, since that presumably already happened some time ago.
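
The "repeat" behaviour quoted above can be sketched as a toy model in Python. This is an assumption-laden illustration, not the PrimaryLogPG implementation: the class `ToyPG`, its fields, and `trim_next` are hypothetical names, and only the idea that a repeat trim skips the final "mark as purged" step is taken from the quoted text.

```python
class ToyPG:
    """Toy stand-in for a PG's snap trim queues (hypothetical, not Ceph code)."""

    def __init__(self, purged_snaps=()):
        self.snap_trimq = set()          # snaps waiting to be trimmed
        self.snap_trimq_repeat = set()   # subset queued as 'repeat' trims
        self.purged_snaps = set(purged_snaps)
        self.marked_purged = []          # snaps for which the final step ran

    def queue_snap_trim(self, snapid):
        # First-time trim of a newly removed snapshot.
        self.snap_trimq.add(snapid)

    def queue_snap_retrim(self, snapid):
        # Requeue a snap that was purged earlier; remember it is a repeat.
        if snapid not in self.snap_trimq:
            self.snap_trimq.add(snapid)
            self.snap_trimq_repeat.add(snapid)

    def trim_next(self):
        snapid = min(self.snap_trimq)
        self.snap_trimq.discard(snapid)
        # (Trimming of the mapped objects would happen here.)
        if snapid in self.snap_trimq_repeat:
            # Repeat trim: skip marking the snap as purged again.
            self.snap_trimq_repeat.discard(snapid)
        else:
            self.purged_snaps.add(snapid)
            self.marked_purged.append(snapid)
        return snapid


pg = ToyPG(purged_snaps={4})
pg.queue_snap_retrim(4)   # snap 4 was purged before; trim it again
pg.queue_snap_trim(7)     # snap 7 is trimmed for the first time
first = pg.trim_next()
second = pg.trim_next()
```

In this model only snap 7 goes through the final "mark purged" step; snap 4 is re-trimmed but its purged state is left as it was.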

CC: @ronen-fr


Matan-B commented Sep 6, 2023

Closing. See the tracker for how this issue is being resolved:
https://tracker.ceph.com/issues/62596

@Matan-B Matan-B closed this Sep 6, 2023