New msgr2 crc and secure modes (msgr2.1)#35078
Conversation
The secure mode uses AES-128-GCM with 96-bit nonces consisting of a 32-bit counter followed by a 64-bit salt. The counter is incremented after processing each frame, the salt is fixed for the duration of the session. Both are initialized from the session key generated during session negotiation, so the counter starts with essentially a random value. It is allowed to wrap, and, after 2**32 frames, it repeats, resulting in nonce reuse (the actual sequence numbers that the messenger works with are 64-bit, so the session continues on). Because of how GCM works, this completely breaks both confidentiality and integrity aspects of the secure mode. A single nonce reuse reveals the XOR of two plaintexts and almost completely reveals the subkey used for producing authentication tags. After a few nonces get used twice, all confidentiality and integrity goes out the window and the attacker can potentially encrypt-authenticate plaintext of their choice. We can't easily change the nonce format to extend the counter to 64 bits (and possibly XOR it with a longer salt). Instead, just remember the initial nonce and cut the session before it repeats, forcing renegotiation. Signed-off-by: Ilya Dryomov <idryomov@gmail.com> Reviewed-by: Radoslaw Zarzynski <rzarzyns@redhat.com> Reviewed-by: Sage Weil <sage@redhat.com>
As a AES-GCM IV, nonce_t is implicitly shared between server and client. Currently, if their endianness doesn't match, they are unable to communicate in secure mode because each gets its own idea of what the next nonce should be after the counter is incremented. Several RFCs state that the nonce counter should be BE, but since we use LE for everything on-disk and on-wire, make it LE. Signed-off-by: Ilya Dryomov <idryomov@gmail.com> Reviewed-by: Radoslaw Zarzynski <rzarzyns@redhat.com> Reviewed-by: Sage Weil <sage@redhat.com>
Signed-off-by: Matt Benjamin <mbenjamin@redhat.com> Reviewed-by: Casey Bodley <cbodley@redhat.com> (cherry picked from commit d8dd5e513c0c62bbd7d3044d7e2eddcd897bd400)
As per Robin's comments and S3 spec Signed-off-by: Abhishek Lekshmanan <abhishek@suse.com>
S3 GetObject permits overriding response header values, but those inputs need to be validated to insure only characters that are valid in an HTTP header value are present. Credit: Initial vulnerability discovery by William Bowling (@wcbowling) Credit: Further vulnerability discovery by Robin H. Johnson <rjohnson@digitalocean.com> Signed-off-by: Robin H. Johnson <rjohnson@digitalocean.com>
Provide an iterator-like interface as initializer lists cannot be formed dynamically. Signed-off-by: Ilya Dryomov <idryomov@gmail.com>
Add asserts to avoid bugs like the one fixed in 1a975fb ("msg/async: fix unnecessary 4 kB allocation in secure mode."). Signed-off-by: Ilya Dryomov <idryomov@gmail.com>
It is unused and doesn't make much sense in TxHandler. Signed-off-by: Ilya Dryomov <idryomov@gmail.com>
|
@cyx1231st just in case. |
|
jenkins test make check |
rzarzynski
left a comment
There was a problem hiding this comment.
Comments from 1st part of the review. So far looks good.
rzarzynski
left a comment
There was a problem hiding this comment.
In progress. Generally looks good so far!
src/msg/async/frames_v2.h
Outdated
| FrameAssembler(const ceph::crypto::onwire::rxtx_t* crypto) | ||
| : m_crypto(crypto) {} | ||
| FrameAssembler(const ceph::crypto::onwire::rxtx_t* crypto, bool is_r1) | ||
| : m_crypto(crypto), m_is_r1(is_r1) {} |
rzarzynski
left a comment
There was a problem hiding this comment.
The changes generally look good! Some nits and questions.
I'm moving forward to a brief performance comparison.
src/test/msgr/test_frames_v2.cc
Outdated
| return bl; | ||
| } | ||
|
|
||
| bool disassemble_frame(FrameAssembler* frame_asm, bufferlist* frame_bl, |
There was a problem hiding this comment.
nit: one param per line.
|
I made a really brief performance evaluation. For small chunk we don't need to worry; for bigger ones there is a nice improvement in latency. I think we can move forward. |
OpenSSL supports in-place decryption so we can avoid allocating potentially multi-megabyte and strictly aligned buffer for each decryption operation. ProtocolV2 actually gets the alignment wrong: after read_frame_segment() allocates with cur_rx_desc.alignment, handle_read_frame_segment() effectively replaces that with segment_t::DEFAULT_ALIGNMENT. Signed-off-by: Ilya Dryomov <idryomov@gmail.com>
Signed-off-by: Ilya Dryomov <idryomov@gmail.com>
Signed-off-by: Ilya Dryomov <idryomov@gmail.com>
Use it in ProtocolV2.h and later in unit tests. While at it, drop the unused len struct. Signed-off-by: Ilya Dryomov <idryomov@gmail.com>
|
@rzarzynski Implemented reserving before appending and addressed most of the style nits (including using mutable references, so the diff is quite large but mostly mechanical). |
Since I haven't received any other comments on the wire format itself, I'm taking off the RFC. |
Start separating frame assembly and disassembly code from frame sending, receiving and handling code, so that assembly and disassembly pieces can be unit tested and hopefully also shared between different messengers (e.g. crimson). This commit factors out the assembly code from Frame. Signed-off-by: Ilya Dryomov <idryomov@gmail.com>
Factor out the disassembly code from ProtocolV2 and switch ProtocolV2 to FrameAssembler. Signed-off-by: Ilya Dryomov <idryomov@gmail.com>
l_msgr_recv_bytes calculation was never updated from msgr1. Signed-off-by: Ilya Dryomov <idryomov@gmail.com>
In preparation for msgr2,1, rename epilogue structs: epilogue_plain_block_t to epilogue_crc_rev0_block_t and epilogue_secure_block_t to epilogue_secure_rev0_block_t (rev0 stands for revision 0). Also, get rid of size constants that just disguise the struct type. Signed-off-by: Ilya Dryomov <idryomov@gmail.com>
Clarify that the frame can be aborted at any point after the preamble and the first segment are put on the wire. When that happens, the remaining segments (including the data segment) may be filled with zeros. Signed-off-by: Ilya Dryomov <idryomov@gmail.com>
|
@rzarzynski One more fixup: moved the assert into |
Implement msgr2.1-crc and msgr2.1-secure modes. Issues with existing msgr2.0-crc and msgr2.0-secure modes and their resolution will be described in doc/dev/msgr2.rst. Signed-off-by: Ilya Dryomov <idryomov@gmail.com>
Move to a 64-bit counter to avoid wrapping and having to reset the session before the counter repeats. This is in line with NIST Recommendation for GCM [1]: "... this Recommendation suggests, but does not require, that the leading (i.e., leftmost) 32 bits of the IV hold the fixed field; and that the trailing (i.e., rightmost) 64 bits hold the invocation field." See commit bb61e6a ("msg/async/ProtocolV2: avoid AES-GCM nonce reuse vulnerabilities"). [1] https://nvlpubs.nist.gov/nistpubs/Legacy/SP/nistspecialpublication800-38d.pdf Signed-off-by: Ilya Dryomov <idryomov@gmail.com>
Signed-off-by: Ilya Dryomov <idryomov@gmail.com>
In both msgr2.0 and msgr2.1, segments can be empty. In msgr2.1, epilogue can be empty as well. Handle both by calling the respective handler function directly instead of allocating a buffer::ptr_node for an empty buffer and passing that through READ[_RXBUF]. Signed-off-by: Ilya Dryomov <idryomov@gmail.com>
We aren't interested in peer_required_features anywhere outside _handle_peer_banner_payload() -- once we know there is no mismatch, it's all about peer_supported_features. Signed-off-by: Ilya Dryomov <idryomov@gmail.com>
Use msgr2.1 if the peer supports it and fall back to msgr2.0 otherwise. Signed-off-by: Ilya Dryomov <idryomov@gmail.com>
Signed-off-by: Ilya Dryomov <idryomov@gmail.com>
| } | ||
|
|
||
| preamble_block_t preamble; | ||
| fill_preamble(tag, preamble); |
There was a problem hiding this comment.
It looks this is the sole caller of fill_preamble(), so returning preamble_block_t seems pretty straightforward. Anyway, this is really a minor as the method is internal to FrameAssembler.
| // more than one segment (i.e. at least one of second to fourth | ||
| // segments is not empty). In crc mode, it stores crcs for | ||
| // second to fourh segments; the preamble and the first segment | ||
| // are covered by their own crcs. In secure mode, the epilogue |
There was a problem hiding this comment.
Yeah, the stuff got more complex. It might be worth adding here a single sentence about the krbd-driven rationale for having the first segment available (we have something like that in msgr2.rst but the comment would be much closer).
Still, not a blocker.
There was a problem hiding this comment.
While it is true that this change is driven by the kernel client, we may want to resurrect the option of receiving into preallocated buffers in userspace at some point too. Having the header crc on the header itself is what msgr1 did (i.e. this goes back to the dawn of ceph), so I don't think the rationale needs stressing here. It's really just a regression fix.
|
A critical follow-on fix for the issue exposed by msgr2.1: #35816. |
This is based on @theanalyst's upstream-rgw-msgr-fixes branch for now (#34927). I'll rebase once it merges.