osd: Optimized EC missing list not updated on recovering shard#65747
osd: Optimized EC missing list not updated on recovering shard#65747bill-scales wants to merge 1 commit intoceph:mainfrom
Conversation
|
Think this is a fix for https://tracker.ceph.com/issues/73249, doing some local testing to check for other regressions and waiting for shaman to recover so I can kick off some teuthology tests |
7f2222d to
722e9c9
Compare
|
Test results so far for the latest version of the fix: https://pulpito.ceph.com/billscales-2025-10-08_07:00:25-rados:thrash-erasure-code-optimizations-main-distro-default-smithi/ Lots of failed jobs due to scrubbing generating false data inconsistency warnings, a few infrastructure issues, 1 recreate of the bug that Alex has fixed, 1 recreate of the stats bug that Jon has fixed, 1 failed run due to an overaggressive error inject |
722e9c9 to
7cb12d0
Compare
|
Other than 38 scrub data inconsistency reports for otherwise successful runs which Jon has a fix for, a clean run |
| stats_last_update[from] = oinfo.last_update; | ||
|
|
||
| update_peer_info(from, oinfo); | ||
| update_peer_info(from, oinfo, BEFORE_ACTIVATE); |
|
This pull request can no longer be automatically merged: a rebase is needed and changes have to be manually resolved |
|
@bill-scales pls rebase |
Shards that are recovering (last_complete != last_update) that skip transactions and log entries because of partial writes are using pwlc to advance last_update. However simply incrementing last_update is not sufficient - there are scenarios where the needed version of a missing object has to be updated. If the shard is already missing object X at version V1 and there was a partial write at V2 that did not update the shard, it does not need to retain the log entry, but it does need to update the missing list to say it needs V2 rather than V1. This ensures all shards report a need for an object at the same version and avoids an assert in MissingLoc::add_active_missing when the primary is trying to combine the missing lists from all the shards to work out what has to be recovered. Avoiding applying pwlc during the early phase of the peering process ensures the missing list gets updated. However if a shard is not missing object X and there was a partial write at V2 that did not update the shard then at the end of peering it is still necessary to advance last_upadte by applying pwlc. This ensures that in later peering cycles the code does not change its mind and think the shard is now missing object X. The fix is to be more sophisticated about when pwlc can be used to advance last_update for a recovering shard. The code now passes in a parameter indicating whether we are in the early (pre activate) or later phase of peering. This also means that additional calls to apply_pwlc are needed when peering gets to activating and is searching for missing to make updates that were not made earlier. Fixes: https://tracker.ceph.com/issues/73249 Signed-off-by: Bill Scales <bill_scales@uk.ibm.com>
7cb12d0 to
56a0687
Compare
|
Rebased - addressed trivial merge conflict + s/peering_state/peering_state_t/g to address minor review comment |
|
jenkins test make check arm64 |
|
Testing in here: https://tracker.ceph.com/issues/73711 |
1 similar comment
|
Testing in here: https://tracker.ceph.com/issues/73711 |
|
Hey @bill-scales, We came across an instance of what looks to be https://tracker.ceph.com/issues/73249 during qa testing, which I believe is the issue you were fixing with this. Teuthology run: Call stack: Could you please take a look? |
|
Add the "needs-qa" label back when the above comment is addressed. |
|
Test failure during the QA run shows that this fix has improved things but still isn't quite right. A bit more tuning is required... |
|
@bill-scales: DNM for now? |
|
Added DNM for now- let's add "needs-qa" when ready! |
Shards that are recovering (last_complete != last_update) are using pwlc to advance last_update for writes that did not effect the shard. However simply incrementing last_update means that the primary doesnt send the shard log entries that it missed and consequently it cannot update its missing list.
If the shard is already missing object X at version V1 and there was a partial write at V2 that did not update the shard, it does not need to retain the log entry, but it does need to update the missing list to say it needs V2 rather than V1. This ensures all shards report a need for an object at the same version and avoids an assert in MissingLoc::add_active_missing when the primary is trying to combine the missing lists from all the shards to work out what has to be recovered.
The fix is to avoid applying pwlc when last_complete != last_update, this forces the primary to send the log to the recovering shard which can then update its missing list (and discarding the log entries as they are partial writes).
Fixes: https://tracker.ceph.com/issues/73249
Contribution Guidelines
To sign and title your commits, please refer to Submitting Patches to Ceph.
If you are submitting a fix for a stable branch (e.g. "quincy"), please refer to Submitting Patches to Ceph - Backports for the proper workflow.
When filling out the below checklist, you may click boxes directly in the GitHub web UI. When entering or editing the entire PR message in the GitHub web UI editor, you may also select a checklist item by adding an
xbetween the brackets:[x]. Spaces and capitalization matter when checking off items this way.Checklist
Show available Jenkins commands
jenkins test classic perfJenkins Job | Jenkins Job Definitionjenkins test crimson perfJenkins Job | Jenkins Job Definitionjenkins test signedJenkins Job | Jenkins Job Definitionjenkins test make checkJenkins Job | Jenkins Job Definitionjenkins test make check arm64Jenkins Job | Jenkins Job Definitionjenkins test submodulesJenkins Job | Jenkins Job Definitionjenkins test dashboardJenkins Job | Jenkins Job Definitionjenkins test dashboard cephadmJenkins Job | Jenkins Job Definitionjenkins test apiJenkins Job | Jenkins Job Definitionjenkins test docsReadTheDocs | Github Workflow Definitionjenkins test ceph-volume allJenkins Jobs | Jenkins Jobs Definitionjenkins test windowsJenkins Job | Jenkins Job Definitionjenkins test rook e2eJenkins Job | Jenkins Job DefinitionYou must only issue one Jenkins command per-comment. Jenkins does not understand
comments with more than one command.