-
Notifications
You must be signed in to change notification settings - Fork 24.4k
Pass extensions to node if extension processing is handled by it #13465
New issue
Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.
By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.
Already on GitHub? Sign in to your account
Conversation
Ref: redis#12760 enabled by default) with older Redis cluster (< 7.0 - extensions not handled) . With some of the extensions enabled by default in 7.2 version, new nodes running 7.2 and above start sending out larger clusterbus message payload including the ping extensions. This caused an incompatibility with node running engine versions < 7.0. Old nodes (< 7.0) would receive the payload from new nodes (> 7.2) would observe a payload length (totlen) > (estlen) and would perform an early exit and won't process the message. This fix introduces a flag `extensions_supported` on the clusterMsg indicating the sender node can handle extensions parsing. Once, a receiver nodes receives a message with this flag set to 1, it would update clusterNode new field extensions_supported and start sending out extensions if it has any. This PR also introduces a DEBUG sub command to enable/disable cluster message extensions `process-clustermsg-extensions` feature. Note: A successful `PING`/`PONG` is required as a sender for a given node to be marked as `extensions_supported` and then extensions message will be sent to it. This could cause a slight delay in receiving the extensions message(s). TCL test verifying the cluster state is healthy irrespective of enabling/disabling cluster message extensions feature. --------- Signed-off-by: Harkrishn Patro <harkrisp@amazon.com>
|
note that i mark the hflag as |
oranagra
left a comment
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
LGTM
| hdr->mflags[0] |= CLUSTERMSG_FLAG0_EXT_DATA; /* Always make other nodes know that | ||
| * this node supports extension data. */ |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
i'm not familiar with this area in the code, maybe you can elaborate.
it looks like before your commit, we set that flag only when the other node doesn't support extensions.
which seems odd. if anything we should set new flags only when talking to new nodes?
although it does make mode sense to always set the flag (assuming old nodes ignore it, rather than breaking on unknown flags).
do you know why Valkey didn't do that too?
did you test the new code against a wide range of versions?
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
In fact, Valkey now always sets this flag, but adds it when writing epingext.
I prefer to build hdr and always set this flag.
Valkey code:
if (link->node && nodeSupportsExtensions(link->node)) {
totlen += writePingExt(hdr, gossipcount); <- always set the mark
} else {
serverLog(LL_DEBUG, "Unable to send extensions data, however setting ext data flag to true");
hdr->mflags[0] |= CLUSTERMSG_FLAG0_EXT_DATA; <- if node doesn't support, we still set the flag.
}There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
do you know why Valkey didn't do that too? did you test the new code against a wide range of versions?
yes, i tested it between 6.2, 7.0,and 7.2
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
but they still set it conditionally. only when the node doesn't support extensions (or is NULL?)
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
the reason of the issue is that the new version of the node sends ext data to nodes that do not support it.
This causes a failure due to a mismatch between the expected length (explen) and the actual total length of the received data.
Lines 2731 to 2736 in 731f2dc
| if ((totlen - explen) < extlen) { | |
| serverLog(LL_WARNING, "Received invalid %s packet with extension data that exceeds " | |
| "total packet length (%lld)", clusterGetMessageTypeString(type), | |
| (unsigned long long) totlen); | |
| return 1; | |
| } |
this PR does the follow things:
- always set CLUSTERMSG_FLAG0_EXT_DATA, because during the meet phase, we do not know whether the connected node supports ext data, we need to make sure that it knows and brings ext data.
- if another node does not support ext data, we will not send it ext data to avoid the handshake failure mentioned above due to explen inconsistencies.
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
but they still set it conditionally. only when the node doesn't support extensions (or is NULL?)
no, if node support ext data, we always set the flag in the writePingExt no matter the extensions is 0 or not.
if (hdr != NULL) {
hdr->mflags[0] |= CLUSTERMSG_FLAG0_EXT_DATA;
hdr->extensions = htons(extensions);
}
so Valkey also sets this flag unconditionally.
|
@oranagra i updated the top comment, the original top comment is incorrect. |
### New Features in binary distributions - 7 new data structures: JSON, Time series, Bloom filter, Cuckoo filter, Count-min sketch, Top-k, t-digest - Redis scalable query engine (including vector search) ### Potentially breaking changes - #12272 `GETRANGE` returns an empty bulk when the negative end index is out of range - #12395 Optimize `SCAN` command when matching data type ### Bug fixes - #13510 Fix `RM_RdbLoad` to enable AOF after RDB loading is completed - #13489 `ACL CAT` - return module commands - #13476 Fix a race condition in the `cache_memory` of `functionsLibCtx` - #13473 Fix incorrect lag due to trimming stream via `XTRIM` command - #13338 Fix incorrect lag field in `XINFO` when tombstone is after the `last_id` of the consume group - #13470 On `HDEL` of last field - update the global hash field expiration data structure - #13465 Cluster: Pass extensions to node if extension processing is handled by it - #13443 Cluster: Ensure validity of myself when loading cluster config - #13422 Cluster: Fix `CLUSTER SHARDS` command returns empty array ### Modules API - #13509 New API calls: `RM_DefragAllocRaw`, `RM_DefragFreeRaw`, and `RM_RegisterDefragCallbacks` - defrag API to allocate and free raw memory ### Performance and resource utilization improvements - #13503 Avoid overhead of comparison function pointer calls in listpack `lpFind` - #13505 Optimize `STRING` datatype write commands - #13499 Optimize `SMEMBERS` command - #13494 Optimize `GEO*` commands reply - #13490 Optimize `HELLO` command - #13488 Optimize client query buffer - #12395 Optimize `SCAN` command when matching data type - #13529 Optimize `LREM`, `LPOS`, `LINSERT`, and `LINDEX` commands - #13516 Optimize `LRANGE` and other commands that perform several writes to client buffers per call - #13431 Avoid `used_memory` contention when updating from multiple threads ### Other general improvements - #13495 Reply `-LOADING` on replica while flushing the db ### CLI tools - #13411 redis-cli: Fix wrong `dbnum` showed after the client reconnected ### Notes - No backward compatibility for replication or persistence. - Additional distributions, upgrade paths, features, and improvements will be introduced in upcoming pre-releases. - With the GA release of 8.0 we will deprecate Redis Stack.
) This PR is based on the commits from PR valkey-io/valkey#52. Ref: #12760 Close #13401 This PR will replace #13449 Fixes compatibilty of Redis cluster (7.2 - extensions enabled by default) with older Redis cluster (< 7.0 - extensions not handled) . With some of the extensions enabled by default in 7.2 version, new nodes running 7.2 and above start sending out larger clusterbus message payload including the ping extensions. This caused an incompatibility with node running engine versions < 7.0. Old nodes (< 7.0) would receive the payload from new nodes (> 7.2) would observe a payload length (totlen) > (estlen) and would perform an early exit and won't process the message. This fix does the following things: 1. Always set `CLUSTERMSG_FLAG0_EXT_DATA`, because during the meet phase, we do not know whether the connected node supports ext data, we need to make sure that it knows and send back its ext data if it has. 2. If another node does not support ext data, we will not send it ext data to avoid the handshake failure due to the incorrect payload length. Note: A successful `PING`/`PONG` is required as a sender for a given node to be marked as `CLUSTERMSG_FLAG0_EXT_DATA` and then extensions message will be sent to it. This could cause a slight delay in receiving the extensions message(s). --------- Signed-off-by: Harkrishn Patro <harkrisp@amazon.com> Co-authored-by: Harkrishn Patro <harkrisp@amazon.com> --------- Signed-off-by: Harkrishn Patro <harkrisp@amazon.com> Co-authored-by: Harkrishn Patro <harkrisp@amazon.com>
) This PR is based on the commits from PR valkey-io/valkey#52. Ref: #12760 Close #13401 This PR will replace #13449 Fixes compatibilty of Redis cluster (7.2 - extensions enabled by default) with older Redis cluster (< 7.0 - extensions not handled) . With some of the extensions enabled by default in 7.2 version, new nodes running 7.2 and above start sending out larger clusterbus message payload including the ping extensions. This caused an incompatibility with node running engine versions < 7.0. Old nodes (< 7.0) would receive the payload from new nodes (> 7.2) would observe a payload length (totlen) > (estlen) and would perform an early exit and won't process the message. This fix does the following things: 1. Always set `CLUSTERMSG_FLAG0_EXT_DATA`, because during the meet phase, we do not know whether the connected node supports ext data, we need to make sure that it knows and send back its ext data if it has. 2. If another node does not support ext data, we will not send it ext data to avoid the handshake failure due to the incorrect payload length. Note: A successful `PING`/`PONG` is required as a sender for a given node to be marked as `CLUSTERMSG_FLAG0_EXT_DATA` and then extensions message will be sent to it. This could cause a slight delay in receiving the extensions message(s). --------- Signed-off-by: Harkrishn Patro <harkrisp@amazon.com> Co-authored-by: Harkrishn Patro <harkrisp@amazon.com> --------- Signed-off-by: Harkrishn Patro <harkrisp@amazon.com> Co-authored-by: Harkrishn Patro <harkrisp@amazon.com>
) This PR is based on the commits from PR valkey-io/valkey#52. Ref: #12760 Close #13401 This PR will replace #13449 Fixes compatibilty of Redis cluster (7.2 - extensions enabled by default) with older Redis cluster (< 7.0 - extensions not handled) . With some of the extensions enabled by default in 7.2 version, new nodes running 7.2 and above start sending out larger clusterbus message payload including the ping extensions. This caused an incompatibility with node running engine versions < 7.0. Old nodes (< 7.0) would receive the payload from new nodes (> 7.2) would observe a payload length (totlen) > (estlen) and would perform an early exit and won't process the message. This fix does the following things: 1. Always set `CLUSTERMSG_FLAG0_EXT_DATA`, because during the meet phase, we do not know whether the connected node supports ext data, we need to make sure that it knows and send back its ext data if it has. 2. If another node does not support ext data, we will not send it ext data to avoid the handshake failure due to the incorrect payload length. Note: A successful `PING`/`PONG` is required as a sender for a given node to be marked as `CLUSTERMSG_FLAG0_EXT_DATA` and then extensions message will be sent to it. This could cause a slight delay in receiving the extensions message(s). --------- Signed-off-by: Harkrishn Patro <harkrisp@amazon.com> Co-authored-by: Harkrishn Patro <harkrisp@amazon.com> --------- Signed-off-by: Harkrishn Patro <harkrisp@amazon.com> Co-authored-by: Harkrishn Patro <harkrisp@amazon.com>
) This PR is based on the commits from PR valkey-io/valkey#52. Ref: #12760 Close #13401 This PR will replace #13449 Fixes compatibilty of Redis cluster (7.2 - extensions enabled by default) with older Redis cluster (< 7.0 - extensions not handled) . With some of the extensions enabled by default in 7.2 version, new nodes running 7.2 and above start sending out larger clusterbus message payload including the ping extensions. This caused an incompatibility with node running engine versions < 7.0. Old nodes (< 7.0) would receive the payload from new nodes (> 7.2) would observe a payload length (totlen) > (estlen) and would perform an early exit and won't process the message. This fix does the following things: 1. Always set `CLUSTERMSG_FLAG0_EXT_DATA`, because during the meet phase, we do not know whether the connected node supports ext data, we need to make sure that it knows and send back its ext data if it has. 2. If another node does not support ext data, we will not send it ext data to avoid the handshake failure due to the incorrect payload length. Note: A successful `PING`/`PONG` is required as a sender for a given node to be marked as `CLUSTERMSG_FLAG0_EXT_DATA` and then extensions message will be sent to it. This could cause a slight delay in receiving the extensions message(s). --------- Signed-off-by: Harkrishn Patro <harkrisp@amazon.com> Co-authored-by: Harkrishn Patro <harkrisp@amazon.com> --------- Signed-off-by: Harkrishn Patro <harkrisp@amazon.com> Co-authored-by: Harkrishn Patro <harkrisp@amazon.com>
…is#13465) This PR is based on the commits from PR valkey-io/valkey#52. Ref: redis#12760 Close redis#13401 This PR will replace redis#13449 Fixes compatibilty of Redis cluster (7.2 - extensions enabled by default) with older Redis cluster (< 7.0 - extensions not handled) . With some of the extensions enabled by default in 7.2 version, new nodes running 7.2 and above start sending out larger clusterbus message payload including the ping extensions. This caused an incompatibility with node running engine versions < 7.0. Old nodes (< 7.0) would receive the payload from new nodes (> 7.2) would observe a payload length (totlen) > (estlen) and would perform an early exit and won't process the message. This fix does the following things: 1. Always set `CLUSTERMSG_FLAG0_EXT_DATA`, because during the meet phase, we do not know whether the connected node supports ext data, we need to make sure that it knows and send back its ext data if it has. 2. If another node does not support ext data, we will not send it ext data to avoid the handshake failure due to the incorrect payload length. Note: A successful `PING`/`PONG` is required as a sender for a given node to be marked as `CLUSTERMSG_FLAG0_EXT_DATA` and then extensions message will be sent to it. This could cause a slight delay in receiving the extensions message(s). --------- Signed-off-by: Harkrishn Patro <harkrisp@amazon.com> Co-authored-by: Harkrishn Patro <harkrisp@amazon.com> --------- Signed-off-by: Harkrishn Patro <harkrisp@amazon.com> Co-authored-by: Harkrishn Patro <harkrisp@amazon.com>
…is#13465) This PR is based on the commits from PR valkey-io/valkey#52. Ref: redis#12760 Close redis#13401 This PR will replace redis#13449 Fixes compatibilty of Redis cluster (7.2 - extensions enabled by default) with older Redis cluster (< 7.0 - extensions not handled) . With some of the extensions enabled by default in 7.2 version, new nodes running 7.2 and above start sending out larger clusterbus message payload including the ping extensions. This caused an incompatibility with node running engine versions < 7.0. Old nodes (< 7.0) would receive the payload from new nodes (> 7.2) would observe a payload length (totlen) > (estlen) and would perform an early exit and won't process the message. This fix does the following things: 1. Always set `CLUSTERMSG_FLAG0_EXT_DATA`, because during the meet phase, we do not know whether the connected node supports ext data, we need to make sure that it knows and send back its ext data if it has. 2. If another node does not support ext data, we will not send it ext data to avoid the handshake failure due to the incorrect payload length. Note: A successful `PING`/`PONG` is required as a sender for a given node to be marked as `CLUSTERMSG_FLAG0_EXT_DATA` and then extensions message will be sent to it. This could cause a slight delay in receiving the extensions message(s). --------- Signed-off-by: Harkrishn Patro <harkrisp@amazon.com> Co-authored-by: Harkrishn Patro <harkrisp@amazon.com> --------- Signed-off-by: Harkrishn Patro <harkrisp@amazon.com> Co-authored-by: Harkrishn Patro <harkrisp@amazon.com>
### New Features in binary distributions - 7 new data structures: JSON, Time series, Bloom filter, Cuckoo filter, Count-min sketch, Top-k, t-digest - Redis scalable query engine (including vector search) ### Potentially breaking changes - redis#12272 `GETRANGE` returns an empty bulk when the negative end index is out of range - redis#12395 Optimize `SCAN` command when matching data type ### Bug fixes - redis#13510 Fix `RM_RdbLoad` to enable AOF after RDB loading is completed - redis#13489 `ACL CAT` - return module commands - redis#13476 Fix a race condition in the `cache_memory` of `functionsLibCtx` - redis#13473 Fix incorrect lag due to trimming stream via `XTRIM` command - redis#13338 Fix incorrect lag field in `XINFO` when tombstone is after the `last_id` of the consume group - redis#13470 On `HDEL` of last field - update the global hash field expiration data structure - redis#13465 Cluster: Pass extensions to node if extension processing is handled by it - redis#13443 Cluster: Ensure validity of myself when loading cluster config - redis#13422 Cluster: Fix `CLUSTER SHARDS` command returns empty array ### Modules API - redis#13509 New API calls: `RM_DefragAllocRaw`, `RM_DefragFreeRaw`, and `RM_RegisterDefragCallbacks` - defrag API to allocate and free raw memory ### Performance and resource utilization improvements - redis#13503 Avoid overhead of comparison function pointer calls in listpack `lpFind` - redis#13505 Optimize `STRING` datatype write commands - redis#13499 Optimize `SMEMBERS` command - redis#13494 Optimize `GEO*` commands reply - redis#13490 Optimize `HELLO` command - redis#13488 Optimize client query buffer - redis#12395 Optimize `SCAN` command when matching data type - redis#13529 Optimize `LREM`, `LPOS`, `LINSERT`, and `LINDEX` commands - redis#13516 Optimize `LRANGE` and other commands that perform several writes to client buffers per call - redis#13431 Avoid `used_memory` contention when updating from multiple threads ### Other general improvements - redis#13495 Reply `-LOADING` on replica while flushing the db ### CLI tools - redis#13411 redis-cli: Fix wrong `dbnum` showed after the client reconnected ### Notes - No backward compatibility for replication or persistence. - Additional distributions, upgrade paths, features, and improvements will be introduced in upcoming pre-releases. - With the GA release of 8.0 we will deprecate Redis Stack.
This PR is based on the commits from PR valkey-io/valkey#52.
Ref: #12760
Close #13401
This PR will replace #13449
Fixes compatibilty of Redis cluster (7.2 - extensions enabled by default) with older Redis cluster (< 7.0 - extensions not handled) .
With some of the extensions enabled by default in 7.2 version, new nodes
running 7.2 and above start sending out larger clusterbus message
payload including the ping extensions. This caused an incompatibility
with node running engine versions < 7.0. Old nodes (< 7.0) would receive
the payload from new nodes (> 7.2) would observe a payload length
(totlen) > (estlen) and would perform an early exit and won't process
the message.
This fix does the following things:
CLUSTERMSG_FLAG0_EXT_DATA, because during the meet phase, we do not know whether the connected node supports ext data, we need to make sure that it knows and send back its ext data if it has.Note: A successful
PING/PONGis required as a sender for a givennode to be marked as
CLUSTERMSG_FLAG0_EXT_DATAand then extensions messagewill be sent to it. This could cause a slight delay in receiving the
extensions message(s).
Signed-off-by: Harkrishn Patro harkrisp@amazon.com
Co-authored-by: Harkrishn Patro harkrisp@amazon.com