Skip to content

[1.3] Conditionally serialize with ODFE package if min version of node in cluster is <OS 1.0.0#2268

Merged
cwperks merged 12 commits intoopensearch-project:1.3from
cwperks:bwc-fix-1.3-to-2.x
Nov 28, 2022
Merged

[1.3] Conditionally serialize with ODFE package if min version of node in cluster is <OS 1.0.0#2268
cwperks merged 12 commits intoopensearch-project:1.3from
cwperks:bwc-fix-1.3-to-2.x

Conversation

@cwperks
Copy link
Copy Markdown
Member

@cwperks cwperks commented Nov 18, 2022

Description

Conditionally apply serialization logic to replace the OpenSearch package name with the ODFE package name if the min version of a node in the cluster is < OS 1.0.

  • Category (Enhancement, New feature, Bug fix, Test fix, Refactoring, Maintenance, Documentation)

Bug fix

  • Why these changes are required?

Rolling upgrades are broken from 1.3 to 2.x. During a rolling upgrade, traffic that hits the old OS 1.0 nodes and creates transport actions to the OS 2.x nodes will fail because the OS 2.x node does not understand the ODFE package naming that was serialized.

Issues Resolved

#2228
#2168

Testing

Tested by replicating the rolling upgrade failure on a cluster of three 1.3.6 nodes, killed 1 node and rebooted as a 2.0 node. Then sent requests to both a 1.3.6 node and a 2.0.0 node using:

// opensearch-node1 is 2.0.0 node
// opensearch-node2 and opensearch-node3 are 1.3.6 nodes

curl -XGET https://admin:admin@opensearch-node1:9200/_nodes --insecure
// output starts with {"_nodes":{"total":3,"successful":3,"failed":0}

curl -XGET https://admin:admin@opensearch-node2:9200/_nodes --insecure
// output starts with {"_nodes":{"total":3,"successful":2,"failed":1,"failures":[{"type":"failed_node_exception","reason":"Failed node [k9FJELd0Q_6uyTTSUgNCLA]","node_id":"k9FJELd0Q_6uyTTSUgNCLA","caused_by":{"type":"exception","reason":"java.lang.ClassNotFoundException: com.amazon.opendistroforelasticsearch.security.user.User","caused_by":{"type":"class_not_found_exception","reason":"class_not_found_exception: com.amazon.opendistroforelasticsearch.security.user.User"}}}]}

Check List

  • New functionality includes testing
  • New functionality has been documented
  • Commits are signed per the DCO using --signoff

By submitting this pull request, I confirm that my contribution is made under the terms of the Apache 2.0 license.
For more information on following Developer Certificate of Origin and signing off your commits, please check here.

Signed-off-by: Craig Perkins <cwperx@amazon.com>
Signed-off-by: Craig Perkins <cwperx@amazon.com>
Signed-off-by: Craig Perkins <cwperx@amazon.com>
Signed-off-by: Craig Perkins <cwperx@amazon.com>
@cwperks
Copy link
Copy Markdown
Member Author

cwperks commented Nov 18, 2022

@peternied Ready to put the new BWC framework to the test?

Copy link
Copy Markdown
Member

@peternied peternied left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

I think this looks good thanks for jumping in! Writing this out to confirm the behavior:

  1. If a cluster contains ODFE nodes the class names will be rewritten in the legacy namespace (ODFE)
    • When a cluster no longer contains any ODFE nodes, a cluster state change will be fired, and the class names will no longer be rewritten
  2. If a cluster contains NO ODFE nodes, the class names will not be rewritten

Doing a little mix-in of possible cluster states:
A. Cluster is transitioning from ODFE-latest to OS 1.3 - #1 will handle until migration completes
B. Cluster is transitioning from OS 1.3 to OS 2.4 - #2 will handle until migration completes
C. Cluster is transitioning from ODFE-latest to OS 2.4 - it will not work and is not supported

Seems reasonable, anything I'm missing here?


private final DescriptorReplacer descriptorReplacer = new DescriptorReplacer();

private boolean replaceWithOdfePackage = true;
Copy link
Copy Markdown
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Whenever possible make this fields final

Copy link
Copy Markdown
Member Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

This is set in the constructor and cannot be set to final.

@cwperks
Copy link
Copy Markdown
Member Author

cwperks commented Nov 18, 2022

@peternied That's exactly right. Upgrade path has to go from the last minor version of the previous major to the current major. When migrating from ODFE to OS 2 it will go through 1.3, the path would be ODFE-1.13 -> OS 1.3 -> OS 2.x.

Signed-off-by: Craig Perkins <cwperx@amazon.com>
Signed-off-by: Craig Perkins <cwperx@amazon.com>
@codecov-commenter
Copy link
Copy Markdown

codecov-commenter commented Nov 18, 2022

Codecov Report

Merging #2268 (720f35b) into 1.3 (983385f) will increase coverage by 0.13%.
The diff coverage is 73.46%.

@@             Coverage Diff              @@
##                1.3    #2268      +/-   ##
============================================
+ Coverage     64.60%   64.73%   +0.13%     
- Complexity     3219     3226       +7     
============================================
  Files           247      247              
  Lines         17352    17377      +25     
  Branches       3078     3097      +19     
============================================
+ Hits          11210    11249      +39     
+ Misses         4590     4578      -12     
+ Partials       1552     1550       -2     
Impacted Files Coverage Δ
...earch/security/privileges/PrivilegesEvaluator.java 71.57% <0.00%> (ø)
...g/opensearch/security/rest/SecurityInfoAction.java 58.00% <33.33%> (+0.55%) ⬆️
...org/opensearch/security/filter/SecurityFilter.java 72.22% <66.66%> (-0.10%) ⬇️
...arch/security/configuration/ClusterInfoHolder.java 66.66% <71.42%> (+0.81%) ⬆️
...pensearch/security/privileges/DlsFlsEvaluator.java 51.61% <75.00%> (-0.06%) ⬇️
...search/security/transport/SecurityInterceptor.java 73.55% <75.00%> (-1.24%) ⬇️
.../org/opensearch/security/support/Base64Helper.java 74.50% <90.00%> (+1.87%) ⬆️
.../opensearch/security/OpenSearchSecurityPlugin.java 80.92% <100.00%> (ø)
...a/org/opensearch/security/tools/SecurityAdmin.java 47.59% <0.00%> (+0.19%) ⬆️
... and 8 more

Help us with your feedback. Take ten seconds to tell us how you rate us. Have a feature suggestion? Share it here.

Copy link
Copy Markdown
Member

@DarshitChanpura DarshitChanpura left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

This looks really good. Thank you @cwperks !

A general comment:
ClusterInfoHolder object is passed into threadContext a lot but only the field hasOdfeNodes is really used from it. IMO, we should only pass that field around and not the whole object. Thoughts?

Signed-off-by: Craig Perkins <cwperx@amazon.com>
@cwperks
Copy link
Copy Markdown
Member Author

cwperks commented Nov 18, 2022

@DarshitChanpura ClusterInfoHolder is a lightweight object and can be used globally. Where do you see if being set on the thread context? ClusterInfoHolder is the class that is subscribed to ClusterChangedEvent and updates as it encounters new events. I think this is the correct way to go, but I'm not positive at this point if there are side effects of moving it to be a class variable of OpenSearchSecurityPlugin. This is still in Draft and I plan to do more testing before converting this to ready for review, but wanted to put a PR out to get initial feedback to see if this approach would work.

@DarshitChanpura
Copy link
Copy Markdown
Member

@DarshitChanpura ClusterInfoHolder is a lightweight object and can be used globally. Where do you see if being set on the thread context? ClusterInfoHolder is the class that is subscribed to ClusterChangedEvent and updates as it encounters new events. I think this is the correct way to go, but I'm not positive at this point if there are side effects of moving it to be a class variable of OpenSearchSecurityPlugin. This is still in Draft and I plan to do more testing before converting this to ready for review, but wanted to put a PR out to get initial feedback to see if this approach would work.

My bad, I misread packageBehaviour with ClusterInfoHolder for some reason. If CIH is lightweight and is prone to updates from ClusterChangedEvent then it does make sense to use the reference instead of passing a value.

…rve not supported: sect571k1 [NIST K-571] (1.3.132.0.38)

Signed-off-by: Craig Perkins <cwperx@amazon.com>
Signed-off-by: Craig Perkins <cwperx@amazon.com>
Signed-off-by: Craig Perkins <cwperx@amazon.com>
Signed-off-by: Craig Perkins <cwperx@amazon.com>
@cwperks cwperks marked this pull request as ready for review November 22, 2022 14:25
@cwperks cwperks requested a review from a team November 22, 2022 14:25
Copy link
Copy Markdown
Member

@DarshitChanpura DarshitChanpura left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Ty @cwperks !

name: build
runs-on: ubuntu-latest
strategy:
fail-fast: true
Copy link
Copy Markdown
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Out of curiosity: Why remove this? Is it because of the intermittent/flaky failures?

Copy link
Copy Markdown
Member Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Yes, 1.3 CI often needs to be run multiple times. This will fail all jobs in the matrix if one fails. I removed this to prevent that from happening.

peternied
peternied previously approved these changes Nov 28, 2022
@peternied peternied dismissed stale reviews from DarshitChanpura and themself via 720f35b November 28, 2022 15:01
@peternied
Copy link
Copy Markdown
Member

Lets merge this as soon as CI has finished passing

@peternied peternied closed this Nov 28, 2022
@cwperks cwperks reopened this Nov 28, 2022
@stephen-crawford
Copy link
Copy Markdown
Contributor

This looks great!

@cwperks cwperks merged commit 00e40e4 into opensearch-project:1.3 Nov 28, 2022
@cwperks cwperks mentioned this pull request Jun 8, 2023
5 tasks
Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment

Labels

None yet

Projects

None yet

Development

Successfully merging this pull request may close these issues.

5 participants