src/osd/OSDMap.cc: Fix encoder to produce same bytestream#55401
src/osd/OSDMap.cc: Fix encoder to produce same bytestream#55401rzarzynski merged 1 commit intoceph:mainfrom
Conversation
Fixes: https://tracker.ceph.com/issues/63389 Signed-off-by: Kamoltat <ksirivad@redhat.com>
|
Your description of the problem does not seem to match the bug. The bug indicates that the OSDs were sending "failed to encode map X with expected crc". That problem happens due to OSDs with newer binaries encoding with their newest known version rather than the older version and does not seem obviously related to the upgrade state on the mons. I'd like to see some kind of automated testing to prevent this in the future as well. |
|
What testing has been done to validate this fix? |
|
Apparently, the warning from the OSDs should have triggered a health warn in teuthology, but since we switched the deployment to cephadm, teuthology runs aren't writing the logs to the cluster log file and the grep for cluster warnings isn't working (and is thereby succeeding rather than failing). We need to fix that, confirm that it catches the currently broken implementation, and then use it to validate that this PR fixes the problem. |
|
I ran an initial validation: https://pulpito.ceph.com/ksirivad-2024-01-31_21:14:43-upgrade:quincy-x-main-distro-default-smithi/7541233/ (without patch) 2024-01-31T21:48:31.745+0000 7f9405310700 -1 mon.c@2(peon).osd e260 update_from_paxos full map CRC mismatch, resetting to canonical https://pulpito.ceph.com/ksirivad-2024-01-31_19:49:38-upgrade:quincy-x-wip-ksirivad-osdmap-encode-bug-distro-default-smithi/7541224/ (with patch) |
|
You're right essentially it is an OSD problem since the root cause is an OSDMap encoder, I'll change the PR description. I was just talking about the Monitors being in charge of applying the increments when there is an OSDMap change, the problem happens once the peon (running in a different version) fetches the increment from MonDB that was encoded with a different encoder, so once the peon applies the increment to it's OSDMap, it gives a different CRC output triggering the check. |
|
The next step is to validate that the fix is not breaking other things. |
The OSDs also apply incrementals -- most of the time that's how they receive a new map. If you think the bug is related to different monitors applying the incrementals differently, please add evidence for it to the bug. |
|
Ah yes, so here is an example of where OSDMonitor paxos service is pulling osdmap from mon_store and uses the encoding features: Lines 696 to 700 in a3bdffb Which then gives the error: I'll also add this to tracker for reference |
Problem:
The root cause was due to the OSDMap encoder of Reef (or higher) producing different byte streams than that of Quincy's. The problem might not be obvious when looking at just encoding/decoding point of view, however, the CRC value is calculated by looking at the buffer list size of the encoded objects. Therefore, the CRC values between Quincy and Reef are different.
When mons are running different versions, e.g.,
during upgrades, we sometimes will encounter a crc mismatch between the crc that is generated
from the leader MON (version n+1) and
the peon MON (version n). This is due to how the OSDMap encoding leaves a different byte stream between Quincy and Reef encoder.
Solution:
Fix the OSDMap encoding in later version such that
it will produce the same byte stream as version <= Quincy
Fixes: https://tracker.ceph.com/issues/63389
Contribution Guidelines
To sign and title your commits, please refer to Submitting Patches to Ceph.
If you are submitting a fix for a stable branch (e.g. "quincy"), please refer to Submitting Patches to Ceph - Backports for the proper workflow.
When filling out the below checklist, you may click boxes directly in the GitHub web UI. When entering or editing the entire PR message in the GitHub web UI editor, you may also select a checklist item by adding an
xbetween the brackets:[x]. Spaces and capitalization matter when checking off items this way.Checklist
Show available Jenkins commands
jenkins retest this pleasejenkins test classic perfjenkins test crimson perfjenkins test signedjenkins test make checkjenkins test make check arm64jenkins test submodulesjenkins test dashboardjenkins test dashboard cephadmjenkins test apijenkins test docsjenkins render docsjenkins test ceph-volume alljenkins test ceph-volume toxjenkins test windowsjenkins test rook e2e