[1.3] Conditionally serialize with ODFE package if min version of node in cluster is <OS 1.0.0#2268
Conversation
Signed-off-by: Craig Perkins <cwperx@amazon.com>
Signed-off-by: Craig Perkins <cwperx@amazon.com>
Signed-off-by: Craig Perkins <cwperx@amazon.com>
Signed-off-by: Craig Perkins <cwperx@amazon.com>
|
@peternied Ready to put the new BWC framework to the test? |
peternied
left a comment
There was a problem hiding this comment.
I think this looks good thanks for jumping in! Writing this out to confirm the behavior:
- If a cluster contains ODFE nodes the class names will be rewritten in the legacy namespace (ODFE)
- When a cluster no longer contains any ODFE nodes, a cluster state change will be fired, and the class names will no longer be rewritten
- If a cluster contains NO ODFE nodes, the class names will not be rewritten
Doing a little mix-in of possible cluster states:
A. Cluster is transitioning from ODFE-latest to OS 1.3 - #1 will handle until migration completes
B. Cluster is transitioning from OS 1.3 to OS 2.4 - #2 will handle until migration completes
C. Cluster is transitioning from ODFE-latest to OS 2.4 - it will not work and is not supported
Seems reasonable, anything I'm missing here?
src/test/java/org/opensearch/security/support/Base64HelperTest.java
Outdated
Show resolved
Hide resolved
|
|
||
| private final DescriptorReplacer descriptorReplacer = new DescriptorReplacer(); | ||
|
|
||
| private boolean replaceWithOdfePackage = true; |
There was a problem hiding this comment.
Whenever possible make this fields final
There was a problem hiding this comment.
This is set in the constructor and cannot be set to final.
|
@peternied That's exactly right. Upgrade path has to go from the last minor version of the previous major to the current major. When migrating from ODFE to OS 2 it will go through 1.3, the path would be ODFE-1.13 -> OS 1.3 -> OS 2.x. |
Signed-off-by: Craig Perkins <cwperx@amazon.com>
Signed-off-by: Craig Perkins <cwperx@amazon.com>
Codecov Report
@@ Coverage Diff @@
## 1.3 #2268 +/- ##
============================================
+ Coverage 64.60% 64.73% +0.13%
- Complexity 3219 3226 +7
============================================
Files 247 247
Lines 17352 17377 +25
Branches 3078 3097 +19
============================================
+ Hits 11210 11249 +39
+ Misses 4590 4578 -12
+ Partials 1552 1550 -2
Help us with your feedback. Take ten seconds to tell us how you rate us. Have a feature suggestion? Share it here. |
DarshitChanpura
left a comment
There was a problem hiding this comment.
This looks really good. Thank you @cwperks !
A general comment:
ClusterInfoHolder object is passed into threadContext a lot but only the field hasOdfeNodes is really used from it. IMO, we should only pass that field around and not the whole object. Thoughts?
src/test/java/org/opensearch/security/support/Base64HelperTest.java
Outdated
Show resolved
Hide resolved
Signed-off-by: Craig Perkins <cwperx@amazon.com>
|
@DarshitChanpura |
My bad, I misread |
…rve not supported: sect571k1 [NIST K-571] (1.3.132.0.38) Signed-off-by: Craig Perkins <cwperx@amazon.com>
src/test/java/com/amazon/dlic/auth/http/jwt/HTTPJwtAuthenticatorTest.java
Outdated
Show resolved
Hide resolved
Signed-off-by: Craig Perkins <cwperx@amazon.com>
Signed-off-by: Craig Perkins <cwperx@amazon.com>
Signed-off-by: Craig Perkins <cwperx@amazon.com>
.github/workflows/ci.yml
Outdated
| name: build | ||
| runs-on: ubuntu-latest | ||
| strategy: | ||
| fail-fast: true |
There was a problem hiding this comment.
Out of curiosity: Why remove this? Is it because of the intermittent/flaky failures?
There was a problem hiding this comment.
Yes, 1.3 CI often needs to be run multiple times. This will fail all jobs in the matrix if one fails. I removed this to prevent that from happening.
720f35b
|
Lets merge this as soon as CI has finished passing |
|
This looks great! |
Description
Conditionally apply serialization logic to replace the OpenSearch package name with the ODFE package name if the min version of a node in the cluster is < OS 1.0.
Bug fix
Rolling upgrades are broken from 1.3 to 2.x. During a rolling upgrade, traffic that hits the old OS 1.0 nodes and creates transport actions to the OS 2.x nodes will fail because the OS 2.x node does not understand the ODFE package naming that was serialized.
Issues Resolved
#2228
#2168
Testing
Tested by replicating the rolling upgrade failure on a cluster of three 1.3.6 nodes, killed 1 node and rebooted as a 2.0 node. Then sent requests to both a 1.3.6 node and a 2.0.0 node using:
Check List
By submitting this pull request, I confirm that my contribution is made under the terms of the Apache 2.0 license.
For more information on following Developer Certificate of Origin and signing off your commits, please check here.