nautilus: New msgr2 crc and secure modes (msgr2.1) #35733
yuriw merged 22 commits into ceph:nautilus
|
@yuriw This is still DNM. There is a subtle issue with nautilus that I'm currently tracking down. |
|
@idryomov we suspect this PR is failing builds |
|
@yuriw Correct, I haven't bothered pushing the fix for the build failure as there is a deeper issue here that is being investigated. |
This will cause an imbalance between workers. Signed-off-by: Jianpeng Ma <jianpeng.ma@intel.com> (cherry picked from commit 5cf027d)
Provide an iterator-like interface as initializer lists cannot be formed dynamically. Signed-off-by: Ilya Dryomov <idryomov@gmail.com> (cherry picked from commit 1fc5cc2)
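The initializer-list limitation mentioned in this commit message can be illustrated with a small sketch. The names below (`Segment`, `total_length`, `make_segments`) are hypothetical and not taken from the Ceph tree; they only show why an iterator-pair overload helps when the number of elements is known only at run time.

```cpp
#include <cstddef>
#include <initializer_list>
#include <vector>

// Hypothetical stand-in for a frame segment; not a Ceph type.
struct Segment {
  std::size_t length;
};

// Iterator-pair interface: accepts any range, including one whose size
// is decided at run time.
template <class It>
std::size_t total_length(It first, It last) {
  std::size_t sum = 0;
  for (; first != last; ++first) {
    sum += first->length;
  }
  return sum;
}

// Convenience overload for callers that know the elements at compile time.
// A std::initializer_list can only be written out as a braced list in the
// source, so it cannot cover the run-time case on its own.
inline std::size_t total_length(std::initializer_list<Segment> segs) {
  return total_length(segs.begin(), segs.end());
}

// Builds a segment list whose size depends on a run-time value.
inline std::vector<Segment> make_segments(std::size_t count, std::size_t len) {
  return std::vector<Segment>(count, Segment{len});
}
```

A caller with a run-time-sized `std::vector<Segment>` would pass `v.begin(), v.end()`, while a caller with a fixed set of segments can keep using the braced form.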
It is unused and doesn't make much sense in TxHandler. Signed-off-by: Ilya Dryomov <idryomov@gmail.com> (cherry picked from commit b3e39b1)
OpenSSL supports in-place decryption so we can avoid allocating potentially multi-megabyte and strictly aligned buffer for each decryption operation. ProtocolV2 actually gets the alignment wrong: after read_frame_segment() allocates with cur_rx_desc.alignment, handle_read_frame_segment() effectively replaces that with segment_t::DEFAULT_ALIGNMENT. Signed-off-by: Ilya Dryomov <idryomov@gmail.com> (cherry picked from commit fe97a00)
Signed-off-by: Ilya Dryomov <idryomov@gmail.com> (cherry picked from commit e1d1f61)
Signed-off-by: Ilya Dryomov <idryomov@gmail.com> (cherry picked from commit 6b6d405)
Use it in ProtocolV2.h and later in unit tests. While at it, drop the unused len struct. Signed-off-by: Ilya Dryomov <idryomov@gmail.com> (cherry picked from commit c081f3c) Conflicts: src/crimson/net/ProtocolV2.cc [ crimson doesn't support msgr2 in nautilus ] src/crimson/net/ProtocolV2.h [ ditto ]
Start separating frame assembly and disassembly code from frame sending, receiving and handling code, so that assembly and disassembly pieces can be unit tested and hopefully also shared between different messengers (e.g. crimson). This commit factors out the assembly code from Frame. Signed-off-by: Ilya Dryomov <idryomov@gmail.com> (cherry picked from commit 872b125) Conflicts: src/crimson/CMakeLists.txt [ crimson doesn't support msgr2 in nautilus ] src/crimson/net/ProtocolV2.cc [ ditto ] src/crimson/net/ProtocolV2.h [ ditto ] src/msg/async/frames_v2.h [ commits f1cf408 ("msg/async/frames_v2.h: fix warning"), c70f779 ("headers: Make ceph_le member private"), 9908f0e ("msg: Add optimizing move") and 1a975fb ("msg/async: fix unnecessary 4 kB allocation in secure mode.") not in nautilus; fmt include adjusted as in rgw_lc.cc ]
Factor out the disassembly code from ProtocolV2 and switch ProtocolV2 to FrameAssembler. Signed-off-by: Ilya Dryomov <idryomov@gmail.com> (cherry picked from commit b9e0cfe) Conflicts: src/msg/async/ProtocolV2.cc [ context: commit d3ec4c01d17 ("msg: Build target 'common' without using namespace in headers") not in nautilus ]
l_msgr_recv_bytes calculation was never updated from msgr1. Signed-off-by: Ilya Dryomov <idryomov@gmail.com> (cherry picked from commit dcf30f5)
In preparation for msgr2.1, rename epilogue structs: epilogue_plain_block_t to epilogue_crc_rev0_block_t and epilogue_secure_block_t to epilogue_secure_rev0_block_t (rev0 stands for revision 0). Also, get rid of size constants that just disguise the struct type. Signed-off-by: Ilya Dryomov <idryomov@gmail.com> (cherry picked from commit 712915b) Conflicts: src/crimson/net/ProtocolV2.cc [ crimson doesn't support msgr2 in nautilus ]
Clarify that the frame can be aborted at any point after the preamble and the first segment are put on the wire. When that happens, the remaining segments (including the data segment) may be filled with zeros. Signed-off-by: Ilya Dryomov <idryomov@gmail.com> (cherry picked from commit bc835b8) Conflicts: src/crimson/net/ProtocolV2.cc [ crimson doesn't support msgr2 in nautilus ] src/msg/async/frames_v2.h [ context: commit c70f779 ("headers: Make ceph_le member private") not in nautilus ]
Implement msgr2.1-crc and msgr2.1-secure modes. Issues with existing msgr2.0-crc and msgr2.0-secure modes and their resolution will be described in doc/dev/msgr2.rst. Signed-off-by: Ilya Dryomov <idryomov@gmail.com> (cherry picked from commit 2966b2a) Conflicts: src/crimson/net/ProtocolV2.cc [ crimson doesn't support msgr2 in nautilus ]
Currently it's a mix of hex and dec, making it hard to grep for. Converge on hex to match client_cookie. Signed-off-by: Ilya Dryomov <idryomov@gmail.com> (cherry picked from commit 779a8df)
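As a rough illustration of converging on one radix for log output, here is a minimal sketch. The helper names `format_cookie` and `describe_cookies` are hypothetical, and the assumption that a `server_cookie` is the value printed next to `client_cookie` is mine, not stated in the commit message.

```cpp
#include <cstdint>
#include <sstream>
#include <string>

// Hypothetical helper (not from the Ceph tree): always render a cookie in
// hex so that every log line uses the same representation.
inline std::string format_cookie(std::uint64_t cookie) {
  std::ostringstream os;
  os << std::hex << cookie;
  return os.str();
}

// Hypothetical log line builder. Both cookies go through the same helper,
// so a value copied from one log line can be grepped in any other.
inline std::string describe_cookies(std::uint64_t client_cookie,
                                    std::uint64_t server_cookie) {
  std::ostringstream os;
  os << "client_cookie=" << format_cookie(client_cookie)
     << " server_cookie=" << format_cookie(server_cookie);
  return os.str();
}
```

Routing every print site through one helper is one way to keep the radix from drifting again; the actual commit may simply adjust the individual stream manipulators.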
reuse_connection() can be called on exproto in BANNER_CONNECTING (i.e. without peer_supported_features and with tx/rx_frame_asm set to msgr2.0), but this state isn't carried over. If the donor connection is msgr2.1, this leads to repeated connection faults on crc or auth tag mismatches because we end up assembling 2.0 frames while the peer is expecting 2.1 frames. Fixes: https://tracker.ceph.com/issues/46180 Signed-off-by: Ilya Dryomov <idryomov@gmail.com> (cherry picked from commit baff20a) Conflicts: src/test/msgr/test_msgr.cc [ commits 5e3aa5d ("ceph_test_msgr: remove simple") and bfb8c74 ("test/msgr: s/Mutex/ceph::mutex/") not in nautilus ]
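The failure mode described here can be modelled with a heavily simplified sketch. None of the types or functions below exist in Ceph under these names; `ProtoState`, `FrameRev`, `FEATURE_REV1`, and `apply_features` are assumptions made purely for illustration, and the real fix may be structured differently.

```cpp
#include <cstdint>

// Hypothetical, simplified model of per-connection protocol state.
enum class FrameRev { Rev0, Rev1 };  // msgr2.0 vs msgr2.1 framing

struct ProtoState {
  std::uint64_t peer_supported_features = 0;  // unset before negotiation
  FrameRev tx_rev = FrameRev::Rev0;           // defaults to msgr2.0
  FrameRev rx_rev = FrameRev::Rev0;
};

// Assumed feature bit meaning "peer speaks msgr2.1"; illustrative only.
constexpr std::uint64_t FEATURE_REV1 = 1;

// Derive the framing revision from the negotiated feature set.
inline void apply_features(ProtoState& s, std::uint64_t features) {
  s.peer_supported_features = features;
  const FrameRev rev =
      (features & FEATURE_REV1) ? FrameRev::Rev1 : FrameRev::Rev0;
  s.tx_rev = rev;
  s.rx_rev = rev;
}

// The existing connection takes over from the donor. If the negotiated
// features were not carried over here, `existing` would keep assembling
// Rev0 frames while the peer expects Rev1, which is the mismatch the
// commit message describes.
inline void reuse_connection(ProtoState& existing, const ProtoState& donor) {
  apply_features(existing, donor.peer_supported_features);
}
```

In this model the existing connection starts in its default msgr2.0 state, and only the explicit carry-over in `reuse_connection` brings it in line with a msgr2.1 donor.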
adaac38 to b9db4ab
|
-s rados/singleton-nomsgr --filter 'all/health-warnings rados' -N 20: |
|
@yuriw @neha-ojha Taking DNM off. |
I tracked it down to the interaction of |
|
@idryomov ok testing this alone |
|
-s rados/singleton-nomsgr --filter 'all/health-warnings rados' -N 200: @neha-ojha One suspicious failure, but it doesn't appear to be related to the messenger. osd.1 couldn't come up because osd.9 kept telling it that it had died ( |
|
@idryomov it passed tests and @neha-ojha reviewed/approved it https://trello.com/c/1RQihhoW |
doesn't look related |
|
@yuriw What about upgrade tests? |
|
@idryomov @neha-ojha What's next? Do we want this in the next release? |
|
Yes, we do. I was waiting for Neha to look at the upgrade runs. Radek should review shortly. |
|
See more testing info https://trello.com/c/7kQyT5fs and per @neha-ojha |
|
I find it odd that no PendingReleaseNote was added for this... would it make sense to write one and add it to the 14.2.11 blog post and the 14.2.11 release notes in master? (If someone comes up with such a release note, I could take care of getting it into the right places.) |
Backport of #35078 (changes from #34927 which got merged via #35078 already in nautilus).
Moderate conflicts, all documented inline.