[WIP]crimson/osd/osd_operations/pg_advance: make PGAdvanceMap a pg_operation#51425
[WIP]crimson/osd/osd_operations/pg_advance: make PGAdvanceMap a pg_operation#51425
Conversation
eb27130 to
fcfe7a6
Compare
|
So the bug is that the PG already has a map epoch after to? Is it that we're processing the same maps from multiple MOSDMap messages? It's entirely possible, likely even, for an OSD to get several MOSDMap messages with overlapping map sequences close together since any peer OSD will send one if it thinks the OSD is out of date. Likely means we're not correctly maintaining the state that would avoid processing the same map more than once. |
@athanatos, I think this is caused by misusing Seastar's I will also note that in an early test run, it seems like this issue no longer appear with 56/60 passed (Crimson suite x3 times) [2] |
|
It's not so much that it's a misuse as that the code as written encodes an assumption that we don't initiate PGAdvanceMap on the same epoch twice. It seems to me like we really do want to enforce that invariant, so I'd like to understand how the PG advanced past that epoch in the first place. Tolerating that case seems at best inefficient and at worst would hide a problem. |
@athanatos, I have made some remarks based on a more thorough examination of this issue and provided an abstract solution/re-design to the current way we schedule a pg advance map operation. Regarding previous comment,
IIUC, this is not an issue of processing the same map more than once, this is being maintained successfully. |
@athanatos Right, it's wip. |
Signed-off-by: Matan Breizman <mbreizma@redhat.com>
Signed-off-by: Matan Breizman <mbreizma@redhat.com>
Added logs If case reconstructed - epoch is unsigned. Signed-off-by: Matan Breizman <mbreizma@redhat.com>
Signed-off-by: Matan Breizman <mbreizma@redhat.com>
Signed-off-by: Matan Breizman <mbreizma@redhat.com>
Signed-off-by: Matan Breizman <mbreizma@redhat.com>
…GAdvanceMap ctor No connection is passed as this is not a pg_operation Signed-off-by: Matan Breizman <mbreizma@redhat.com>
fcfe7a6 to
5eb7d11
Compare
| [this, conn, advance_from, advance_to, &state](auto& pg) { | ||
| return start_pg_operation_with_state<PGAdvanceMap>( | ||
| state, | ||
| conn, | ||
| shard_services.local(), | ||
| pg.second, advance_from, advance_to, |
There was a problem hiding this comment.
This essentially will allow us to schedule a PGAdvanceMap as an exclusive pg_operation with bounded advance_from and advance_to epochs.
(Pushed for early round of review)
0c5bb48 to
d340687
Compare
d340687 to
38ed821
Compare
…peration Signed-off-by: Matan Breizman <mbreizma@redhat.com>
Signed-off-by: Matan Breizman <mbreizma@redhat.com>
Signed-off-by: Matan Breizman <mbreizma@redhat.com>
…ional Signed-off-by: Matan Breizman <mbreizma@redhat.com>
Signed-off-by: Matan Breizman <mbreizma@redhat.com>
Signed-off-by: Matan Breizman <mbreizma@redhat.com>
Signed-off-by: Matan Breizman <mbreizma@redhat.com>
Signed-off-by: Matan Breizman <mbreizma@redhat.com>
Signed-off-by: Matan Breizman <mbreizma@redhat.com>
38ed821 to
66c4485
Compare
|
Closing in favor of #51860 |
[eb27130] Fixes the following case:
Drafted for discussion - comments left in commit. The fix assumes that if the pg was already advanced to a later osdmap than
to- we can skip thePGAdvanceMapoperation.https://gist.github.com/Matan-B/67a6c2a9b9c581600799c2c37704e913
Fixes: https://tracker.ceph.com/issues/59165
Contribution Guidelines
To sign and title your commits, please refer to Submitting Patches to Ceph.
If you are submitting a fix for a stable branch (e.g. "pacific"), please refer to Submitting Patches to Ceph - Backports for the proper workflow.
Checklist
Show available Jenkins commands
jenkins retest this pleasejenkins test classic perfjenkins test crimson perfjenkins test signedjenkins test make checkjenkins test make check arm64jenkins test submodulesjenkins test dashboardjenkins test dashboard cephadmjenkins test apijenkins test docsjenkins render docsjenkins test ceph-volume alljenkins test ceph-volume toxjenkins test windows