Do not migrate function in new atomic slot migration#2547
Conversation
If all cluster nodes have functions, slot migration will fail since the target will return the function already exists error when doing the FUNCTION LOAD. And in addition, the target's replica could panic when it executes the FUNCTION LOAD propagated from the primary (see propagation-error-behavior). Introduced in valkey-io#1949. Signed-off-by: Binbin <binloveplay1314@qq.com>
Signed-off-by: Binbin <binloveplay1314@qq.com>
Codecov Report✅ All modified and coverable lines are covered by tests. Additional details and impacted files@@ Coverage Diff @@
## unstable #2547 +/- ##
============================================
+ Coverage 72.04% 72.17% +0.12%
============================================
Files 126 126
Lines 70615 70614 -1
============================================
+ Hits 50872 50963 +91
+ Misses 19743 19651 -92
🚀 New features to boost your workflow:
|
zuiderkwast
left a comment
There was a problem hiding this comment.
Makes sense to me. (I didn't review the test.)
Signed-off-by: Binbin <binloveplay1314@qq.com>
|
I don't think we need to mention this in the release notes. We just modified (fixed) behavior that's not yet released. |
We release this in RC1, didn't we? So when we release RC2 we need to mention the fix? |
Good point. But this was not documented behaviour. Should this PR be listed as a bugfix in the RC2 release notes? |
I thought this was intended behavior. How are users supposed to migrate functions? Through the CLI? |
|
@madolson Right. Migrating functions was not intended but it happened and it could cause trouble, so it's a bug in RC 1 and this is bugfix to be released in RC 2. Correct? Or don't we need to mention it in the release notes of rc2? |
|
We could mention it as a bug fix. I do wonder if we shouldn't just migrate the functions though? Instead of crashing we could send them with replace. |
|
In our fork, back to 7.0, when we adapted the function, i do converted it to a replace so that the target node will have the function (Noted that we are adding an empty target node to the cluster when doing the migration). And in the OSS version, at the time (7.0), i had the impression that functions were something that cluster administrators needed to load. For example, when adding a new node, the function needed to be loaded on all nodes (including the new one) (this is why we added function support to the CLI). In another scenario, in our OSS, the target node already exists in the cluster (in most cases), so it should already have the function. I guess we should identify whether a function needs migration or not. Note that in the old setslot migration, we did not migrate the function either. |
|
@enjoy-binbin Agree, adding functions is more related to adding a new node to the cluster. We could make |
If all cluster nodes have functions, slot migration will fail since the target will return the function already exists error when doing the FUNCTION LOAD. And in addition, the target's replica could panic when it executes the FUNCTION LOAD propagated from the primary (see propagation-error-behavior). Introduced in valkey-io#1949. Signed-off-by: Binbin <binloveplay1314@qq.com>
If all cluster nodes have functions, slot migration will fail since the target will return the function already exists error when doing the FUNCTION LOAD. And in addition, the target's replica could panic when it executes the FUNCTION LOAD propagated from the primary (see propagation-error-behavior). Introduced in #1949. Signed-off-by: Binbin <binloveplay1314@qq.com>
If all cluster nodes have functions, slot migration will fail since the target will return the function already exists error when doing the FUNCTION LOAD. And in addition, the target's replica could panic when it executes the FUNCTION LOAD propagated from the primary (see propagation-error-behavior). Introduced in valkey-io#1949. Signed-off-by: Binbin <binloveplay1314@qq.com> Signed-off-by: Harkrishn Patro <harkrisp@amazon.com>
If all cluster nodes have functions, slot migration will fail
since the target will return the function already exists error
when doing the FUNCTION LOAD.
And in addition, the target's replica could panic when it executes
the FUNCTION LOAD propagated from the primary (see propagation-error-behavior).
Introduced in #1949.