osd/osd_types: inc-recovery - add special handler for lost_revert#29893
Merged
xiexingguo merged 4 commits intoceph:masterfrom Aug 27, 2019
Merged
osd/osd_types: inc-recovery - add special handler for lost_revert#29893xiexingguo merged 4 commits intoceph:masterfrom
xiexingguo merged 4 commits intoceph:masterfrom
Conversation
In general we shall build missing set (and hence clean_regions) based on pg log. However, currently there are still 5 cases we might call missing.add to add a new pg_missing_item into the missing set explicitly (or replace an existing pg_missing_item entirely): 1. we explicitly build missing set on startup, in which case we know we are trying to be compatiable with pre-kraken versions, so it should be ok for us to disable inc-recovery. 2. we are currently processing authoritative log, and there are some divergent objects detected. For simplicity (and correctness), we should disable inc-recovery entirly for these objects. 3. we are re-building missing set, e.g., due to the global CEPH_OSDMAP_RECOVERY_DELETES policy changing. In this case we know we are at the end of upgrading from a pervious version that is lack of CEPH_OSDMAP_RECOVERY_DELETES support. Hence it should be the recommended option to disable inc-recovery simultaneously since these objects should be lack of inc-recovery support too. 4. we are adding or re-adding missing object into primary's missing_loc. It doesn't matter whether we have a correct clean_regions there since we never actually refer to that field from missing_loc when we actually start to perform object recovery later. 5. we are auto-repairing a corrupted object and hence the need of adding it to the corresponding missing set first, e.g, by leveraging the existing recovery procedure. In this case, we always disable inc-recovery to make sure this object can be fully (and correctly) recovered later. Signed-off-by: xie xingguo <xie.xingguo@zte.com.cn>
because below here we know we'll always mark object as fully dirty. Signed-off-by: xie xingguo <xie.xingguo@zte.com.cn>
There is no consumer. Actually, I think this field is only meaningful to be used to indicate whether we should initiate an inc-recovery or not. If not, then we shall fall back to triggering a full-recovery instead. Signed-off-by: xie xingguo <xie.xingguo@zte.com.cn>
For unfound objects, we might append LOST_REVERT log entries, which shall allow these objects to be reverted to the newest available version later. However, we are currently lack of support to rewind the clean_regions portion too when marking unfound objects as lost with inc-recovery mode enabled. Hence we must mark these unfound-revert objects as fully dirty to make sure they can be correctly recovered. E.g.,: - primary is pulling object A from replica 1 - object A is corrupted on replica 1 - object A is now unfound - mark object A as lost, replica 1 will persist a wrong missing item for object A.. Signed-off-by: xie xingguo <xie.xingguo@zte.com.cn>
jdurgin
approved these changes
Aug 26, 2019
Member
jdurgin
left a comment
There was a problem hiding this comment.
nice cleanups and thorough analysis!
NitzanMordhai
added a commit
to NitzanMordhai/ceph
that referenced
this pull request
Apr 7, 2025
Recent changes (PR ceph#29893) removed the “new_object” parameter from missing.add() and the pg_missing_item constructor. As a result, when processing delete log entries, if an object is found on disk, its on‑disk version is stored as “have” instead of the default eversion_t() (0'0). The invariant in read_log_and_missing() then fails because delete entries are expected to have “have” set to eversion_t(). This patch reintroduces the following check: if (have == eversion_t()) clean_regions.mark_object_new(); By doing so, we ensure that when the on‑disk “have” is default, the missing record is marked as new—restoring the previous behavior and satisfying the invariant for delete operations. Fixes: https://tracker.ceph.com/issues/45702 Signed-off-by: Nitzan Mordechai <nmordech@redhat.com>
14 tasks
NitzanMordhai
added a commit
to NitzanMordhai/ceph
that referenced
this pull request
May 7, 2025
Recent changes (PR ceph#29893) removed the “new_object” parameter from missing.add() and the pg_missing_item constructor. As a result, when processing delete log entries, if an object is found on disk, its on‑disk version is stored as “have” instead of the default eversion_t() (0'0). The invariant in read_log_and_missing() then fails because delete entries are expected to have “have” set to eversion_t(). This patch reintroduces the following check: if (have == eversion_t()) clean_regions.mark_object_new(); By doing so, we ensure that when the on‑disk “have” is default, the missing record is marked as new—restoring the previous behavior and satisfying the invariant for delete operations. Fixes: https://tracker.ceph.com/issues/45702 Signed-off-by: Nitzan Mordechai <nmordech@redhat.com> (cherry picked from commit 4289371)
NitzanMordhai
added a commit
to NitzanMordhai/ceph
that referenced
this pull request
May 7, 2025
Recent changes (PR ceph#29893) removed the “new_object” parameter from missing.add() and the pg_missing_item constructor. As a result, when processing delete log entries, if an object is found on disk, its on‑disk version is stored as “have” instead of the default eversion_t() (0'0). The invariant in read_log_and_missing() then fails because delete entries are expected to have “have” set to eversion_t(). This patch reintroduces the following check: if (have == eversion_t()) clean_regions.mark_object_new(); By doing so, we ensure that when the on‑disk “have” is default, the missing record is marked as new—restoring the previous behavior and satisfying the invariant for delete operations. Fixes: https://tracker.ceph.com/issues/45702 Signed-off-by: Nitzan Mordechai <nmordech@redhat.com> (cherry picked from commit 4289371)
NitzanMordhai
added a commit
to NitzanMordhai/ceph
that referenced
this pull request
May 7, 2025
Recent changes (PR ceph#29893) removed the “new_object” parameter from missing.add() and the pg_missing_item constructor. As a result, when processing delete log entries, if an object is found on disk, its on‑disk version is stored as “have” instead of the default eversion_t() (0'0). The invariant in read_log_and_missing() then fails because delete entries are expected to have “have” set to eversion_t(). This patch reintroduces the following check: if (have == eversion_t()) clean_regions.mark_object_new(); By doing so, we ensure that when the on‑disk “have” is default, the missing record is marked as new—restoring the previous behavior and satisfying the invariant for delete operations. Fixes: https://tracker.ceph.com/issues/45702 Signed-off-by: Nitzan Mordechai <nmordech@redhat.com> (cherry picked from commit 4289371)
This file contains hidden or bidirectional Unicode text that may be interpreted or compiled differently than what appears below. To review, open the file in an editor that reveals hidden Unicode characters.
Learn more about bidirectional Unicode characters
Sign up for free
to join this conversation on GitHub.
Already have an account?
Sign in to comment
Add this suggestion to a batch that can be applied as a single commit.This suggestion is invalid because no changes were made to the code.Suggestions cannot be applied while the pull request is closed.Suggestions cannot be applied while viewing a subset of changes.Only one suggestion per line can be applied in a batch.Add this suggestion to a batch that can be applied as a single commit.Applying suggestions on deleted lines is not supported.You must change the existing code in this line in order to create a valid suggestion.Outdated suggestions cannot be applied.This suggestion has been applied or marked resolved.Suggestions cannot be applied from pending reviews.Suggestions cannot be applied on multi-line comments.Suggestions cannot be applied while the pull request is queued to merge.Suggestion cannot be applied right now. Please check back later.
Checklist
Show available Jenkins commands
jenkins retest this pleasejenkins test make checkjenkins test make check arm64jenkins test submodulesjenkins test dashboardjenkins test dashboard backendjenkins test docsjenkins render docs