Bug #66059
OSD: PG stat is not synchronized between osds after deep-scrub
Description
Steps to reproduce:
1. Create a large omap object (more than 1 GB).
2. Deep-scrub the PG that contains that object.
3. The large omap object count will be 1 (check `ceph health detail`).
4. Shut down the primary OSD for that PG.
5. A secondary OSD takes over as primary. Check the health detail again: the warning is gone.
A better way to identify this problem is to add logging in OSD::collect_pg_stat. When a large omap object is created, the PG stat on the primary OSD shows a large omap object count of 0. After deep-scrub it is 1. If we then shut down the primary OSD and check the log of the new primary OSD, the PG stat for that PG shows 0 again.
Root cause: when an object is changed, the primary OSD submits a transaction through `void ReplicatedBackend::submit_transaction`, which eventually calls `Message * ReplicatedBackend::generate_subop`; there it ships the transaction together with the PG stat and log entries. Deep-scrub, however, changes the PG stat on the primary OSD but does not publish that PG stat to the other OSDs.
A possible solution is to add a mechanism that publishes the PG stat to the non-primary OSDs after deep-scrub.
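The behaviour described above can be illustrated with a small toy model. This is only a sketch of the hypothesis in this description, not Ceph code: `ToyPgStats`, `ToyOsd`, `ToyPg`, and all of their members are hypothetical names invented for illustration, and the model assumes that replicas learn about stats only when the primary explicitly ships them.

```cpp
#include <cassert>
#include <cstddef>
#include <vector>

// Hypothetical stand-in for per-PG statistics; only the counter discussed here.
struct ToyPgStats {
    int num_large_omap_objects = 0;
};

// Hypothetical OSD that keeps its own copy of the stats for one PG.
struct ToyOsd {
    ToyPgStats stats;
};

// Hypothetical PG with an acting set; acting[0] plays the primary.
struct ToyPg {
    std::vector<ToyOsd> acting;

    explicit ToyPg(std::size_t n) : acting(n) {}

    ToyOsd& primary() { return acting.front(); }

    // Copy the primary's stats to every replica.
    void share_stats() {
        for (std::size_t i = 1; i < acting.size(); ++i) {
            acting[i].stats = primary().stats;
        }
    }

    // A write ships the stats along with the transaction (assumed behaviour).
    void submit_write() { share_stats(); }

    // Deep-scrub recounts large omap objects on the primary only.
    // publish_to_replicas models the proposed fix.
    void deep_scrub(int large_omap_found, bool publish_to_replicas) {
        primary().stats.num_large_omap_objects = large_omap_found;
        if (publish_to_replicas) {
            share_stats();
        }
    }

    // The primary goes down and the next OSD in the acting set takes over.
    void fail_primary() { acting.erase(acting.begin()); }
};

// Runs the reproduction steps and returns the counter seen by the new primary.
int large_omap_after_failover(bool publish_to_replicas) {
    ToyPg pg(3);
    pg.submit_write();                      // object created, stats shipped (count 0)
    pg.deep_scrub(1, publish_to_replicas);  // scrub finds one large omap object
    pg.fail_primary();                      // primary is shut down
    return pg.primary().stats.num_large_omap_objects;
}
```

In this model, `large_omap_after_failover(false)` corresponds to the reported behaviour (the new primary never received the updated counter), while `large_omap_after_failover(true)` corresponds to publishing the stats after deep-scrub.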
I am able to reproduce it in quincy v17.2.7 but other versions will have the problem as well.
Updated by Md Mahamudur Rahaman Sajib almost 2 years ago
- Assignee changed from Ronen Friedman to Md Mahamudur Rahaman Sajib
- Pull request ID set to 57582
Updated by Md Mahamudur Rahaman Sajib almost 2 years ago
- Status changed from New to In Progress
Updated by Radoslaw Zarzynski almost 2 years ago
Note from scrub: letting Ronen know.
Updated by Laura Flores almost 2 years ago
- Status changed from In Progress to Fix Under Review
Updated by Md Mahamudur Rahaman Sajib over 1 year ago
- Status changed from Fix Under Review to Pending Backport
Updated by Md Mahamudur Rahaman Sajib over 1 year ago
- Copied to Backport #68439: quincy: OSD: PG stat is not synchronized between osds after deep-scrub added
Updated by Md Mahamudur Rahaman Sajib over 1 year ago
- Copied to Backport #68440: reef: OSD: PG stat is not synchronized between osds after deep-scrub added
Updated by Md Mahamudur Rahaman Sajib over 1 year ago
- Copied to Backport #68441: squid: OSD: PG stat is not synchronized between osds after deep-scrub added
Updated by Md Mahamudur Rahaman Sajib over 1 year ago
- Tags (freeform) set to backport_processed
Updated by Ronen Friedman over 1 year ago
- Status changed from Pending Backport to In Progress
- Pull request ID deleted (57582)
Reverted the status to Open, as PR #57582 creates test failures and will be reverted.
Updated by Ronen Friedman over 1 year ago · Edited
As far as I understand, the root-cause analysis in the description isn't correct.
Scrub-generated fix operations to the 'info' are indeed performed on the
Primary, but are published immediately (at the end of scrub_finish(), which - for our
omap case - is a few lines of code below the info update).
The update function is share_pg_info().
The part I am verifying now: it seems that
PeeringState::proc_primary_info()
(1) only updates some scrub info data (and not the 'large omap' counter), and
(2) is only triggered when there are 'scrub errors' (num_scrub_errors > 0); and num_scrub_errors is 0 if our only problem is large omap objects.
Updated by Ronen Friedman over 1 year ago
- Tags (freeform) deleted (backport_processed)
Updated by Radoslaw Zarzynski over 1 year ago
- Assignee changed from Md Mahamudur Rahaman Sajib to Ronen Friedman
Updated by Konstantin Shalygin about 1 year ago
- Backport changed from squid, reef, quincy to squid, reef
Updated by Konstantin Shalygin about 1 year ago
- Status changed from In Progress to New
- Target version set to v20.0.0