Re-build remote cluster connections on credential changes#103460
elasticsearchmachine merged 76 commits into elastic:main
Conversation
…2798)" (elastic#103211)" This reverts commit 9db4cb4.
…elastic/elasticsearch into revert-103211-revert-102798-rcs2-reload
…build-connections
jakelandis left a comment
I haven't reviewed the tests yet, but looking good ... a couple comments.
```java
// We avoid stashing and marking context as system to keep the action as minimal as possible (i.e., avoid copying context)
remoteClusterService.updateRemoteClusterCredentials(request.getSettings());
listener.onResponse(ActionResponse.Empty.INSTANCE);
assert Transports.assertNotTransportThread("Remote connection re-building is too much for a transport thread");
```

```java
);
assert future.isDone() : "expecting local-only action call to return immediately on invocation";
future.actionGet(0, TimeUnit.NANOSECONDS);
future.actionGet(10, TimeUnit.SECONDS);
```
I'm not sure we want a timeout here. If the timeout is hit, it will not cancel the underlying work, and other than returning sooner I'm not sure what that buys us. It only encourages calling the API again, which can compound the already unhealthy scenario (and we will never be able to guess a correct upper bound unless it is huge). If anything, the per-connection call could have a timeout/retry, but if it does not already, I don't see a need to add one now. I.e., use the explicit listener variant instead of the implicit listener used in the future to make the call to the action.
```java
    return;
}

final PlainActionFuture<ActionResponse.Empty> future = new PlainActionFuture<>();
```
Can we short-circuit this entirely if the credentials did not change?
It's tricky: the absence of credentials in the settings alone doesn't mean a no-op (it could be that we removed credentials) -- anything else would require access to the remote cluster service, which is not available here. I originally wanted to add a short-circuit here as well but didn't come up with a good way to do it.
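To illustrate why detecting a no-op needs both the old and the new credential state, here is a hypothetical, stand-alone sketch (the class and method names are illustrative, not Elasticsearch code). It only compares alias sets, ignoring changed credential values for brevity: a removal shows up as a change even though the new settings contain no credentials at all.

```java
import java.util.HashSet;
import java.util.Set;

class CredentialChangeSketch {
    // Returns the aliases whose credentials were added or removed.
    // A credentials update is a no-op only when this set is empty; the
    // absence of credentials in the *new* settings alone proves nothing,
    // because a removal is itself a change that requires a rebuild.
    static Set<String> changedAliases(Set<String> oldAliases, Set<String> newAliases) {
        Set<String> changed = new HashSet<>(oldAliases);
        changed.addAll(newAliases);                 // union
        Set<String> unchanged = new HashSet<>(oldAliases);
        unchanged.retainAll(newAliases);            // intersection
        changed.removeAll(unchanged);               // symmetric difference
        return changed;
    }

    public static void main(String[] args) {
        Set<String> before = Set.of("cluster_a", "cluster_b");
        Set<String> after = Set.of("cluster_b", "cluster_c");
        // cluster_a was removed and cluster_c was added: both need a rebuild
        System.out.println(changedAliases(before, after));
    }
}
```

The short-circuit the comment asks about would require access to the "before" set, which is exactly the state that is not available at this call site.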
```java
GroupedActionListener<Void> groupedListener
) {
    if (remoteClusters.containsKey(clusterAlias)) {
        updateRemoteCluster(clusterAlias, settings, true, ActionListener.wrap(status -> {
```
Does this handle tearing down a connection after removing a credential?
If a connection exists and we remove a credential, updateRemoteCluster handles re-building it. If the connection does not exist, that means the cluster settings for it were removed, and updateRemoteCluster will have torn it down on the regular settings update call that removed those settings. So if a connection no longer exists but a credential does, and that credential is removed, there is nothing left to tear down. Let me know if I misinterpreted the question!
Very solid work on this PR! It looks great. Easy to follow, good tests, and a lot of thought has been put into covering corner cases. I only have some optional comments.

One thing I couldn't really wrap my head around is whether the reload of credentials is graceful or not. What happens to ongoing CCR and CCS requests when the credentials are reloaded? Are they just going to fail until the new connection is up? Maybe that's fine.
```java
final GroupedActionListener<Boolean> groupedListener = new GroupedActionListener<>(
    totalConnectionsToRebuild,
    listener.map(successFlags -> {
        logger.info("rebuild complete for [{}] connections after credentials update", successFlags.size());
```
Cool! I like the GroupedActionListener.
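For readers unfamiliar with the pattern quoted above, here is a minimal stand-alone sketch of what a grouped listener does: fan out N per-connection rebuilds and fire a single completion callback once all of them have reported back. This is a simplified stand-in written for illustration, not the actual Elasticsearch GroupedActionListener class, and all names are hypothetical.

```java
import java.util.ArrayList;
import java.util.List;
import java.util.function.Consumer;

class GroupedListenerSketch {
    private final int expected;
    private final List<Boolean> results = new ArrayList<>();
    private final Consumer<List<Boolean>> onAllDone;

    GroupedListenerSketch(int expected, Consumer<List<Boolean>> onAllDone) {
        this.expected = expected;
        this.onAllDone = onAllDone;
    }

    // Called once per connection rebuild. Synchronization mirrors the fact
    // that the real listener must accept callbacks from concurrent rebuilds.
    synchronized void onResponse(boolean success) {
        results.add(success);
        if (results.size() == expected) {
            onAllDone.accept(results);   // fires exactly once, after the last callback
        }
    }

    public static void main(String[] args) {
        GroupedListenerSketch listener = new GroupedListenerSketch(
            3,
            flags -> System.out.println("rebuild complete for " + flags.size() + " connections")
        );
        listener.onResponse(true);
        listener.onResponse(true);
        listener.onResponse(false); // third callback triggers the completion line
    }
}
```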
```java
        Strings.collectionToCommaDelimitedString(clusterCredentials.keySet())
    )
);

public final synchronized UpdateRemoteClusterCredentialsResult updateClusterCredentials(Settings settings) {
```
This is only called from a single-threaded context. Might consider keeping it synchronized for future proofing, but it adds some overhead to acquire an extra lock.
Yes, great point. I meant to take a closer look at being less heavy-handed with synchronization but didn't get to it. Thanks for prompting me to think about this again. I agree that with the current structure, we can remove synchronization here. I'll think on it and tweak some things 👍
Discussed this on Slack also:

- I prefer to keep this one `synchronized` for future proofing.
- I removed sorting the cluster aliases, as that adds unnecessary overhead without utility. Originally, I used sorting with the notion of preventing deadlocks in `RemoteClusterService::updateRemoteClusterCredentials` (since we iterate over aliases and call a synchronized method for each, we have a potential deadlock). However, `updateRemoteClusterCredentials` is itself synchronized at the top level, so we're safe.

Thanks for raising and discussing this 👍
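The deadlock reasoning above can be sketched in a stand-alone example (all class and method names here are hypothetical, not Elasticsearch code): because the top-level update method is itself synchronized, two concurrent callers are serialized before either one starts iterating aliases and taking the per-connection lock, so no alias sorting for lock ordering is needed.

```java
import java.util.List;

class LockOrderingSketch {
    private final Object connectionLock = new Object();

    // Top-level entry point is synchronized on the service instance, so only
    // one caller at a time ever reaches the per-alias locking below.
    synchronized void updateAllCredentials(List<String> aliases) {
        for (String alias : aliases) {
            updateOne(alias);
        }
    }

    private void updateOne(String alias) {
        synchronized (connectionLock) {
            // stand-in for rebuilding the connection for this alias
        }
    }

    public static void main(String[] args) throws InterruptedException {
        LockOrderingSketch service = new LockOrderingSketch();
        // Opposite iteration orders would risk deadlock with fine-grained
        // locking alone; the top-level lock serializes them instead.
        Thread t1 = new Thread(() -> service.updateAllCredentials(List.of("a", "b")));
        Thread t2 = new Thread(() -> service.updateAllCredentials(List.of("b", "a")));
        t1.start();
        t2.start();
        t1.join();
        t2.join();
        System.out.println("both updates completed without deadlock");
    }
}
```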
```java
        remoteClusters.put(clusterAlias, remote);
        remote.ensureConnected(listener.map(ignored -> RemoteClusterConnectionStatus.CONNECTED));
    } else if (remote.shouldRebuildConnection(newSettings)) {
    } else if (forceRebuild || remote.shouldRebuildConnection(newSettings)) {
```
nit: Since we know that this will always be true when we reload credentials, it might make sense to break this into its own method to avoid the boolean in the method signature.
Agreed that the flag is not ideal, but having stared at the method for a bit, I don't see a good way to refactor it without basically duplicating it or using some sort of boolean flag in the end. Did you have a specific tweak in mind? Partly, I wanted to avoid touching too much of the connection rebuilding code since it's fairly complex and outside of the security team's scope.
Makes sense! It's fairly complex, so I think it's out of scope to refactor any of that. 👍
```java
@@ -20,18 +22,80 @@ public void testResolveRemoteClusterCredentials() {
    final String clusterAlias = randomAlphaOfLength(9);
    final String otherClusterAlias = randomAlphaOfLength(10);
```
If the CredentialsManager is expected to be thread safe, a concurrency test could be useful in this class.
I didn't get around to this yet -- I think a concurrency test for RemoteClusterService::updateRemoteClusterCredentials would give us the most useful coverage. It's a really nice test to add, but I don't want to defer another round of review for it. My plan is to still add it.
I'll log a follow-up item in Jira to address this one, to not block getting this out the door.
```java
import static org.hamcrest.Matchers.containsInAnyOrder;

// account for slow stored secure settings updates (involves removing and re-creating the keystore)
```
It would be interesting to have a test case that blasts the API with concurrent reload requests combined with config updates.
Agreed. It wouldn't work with a REST-level test because of the keystore file rewrite, but I'll see if there is something lower-level we can do.
@jfreden thanks for the review!

Not very graceful, indeed, and it may cause CCS/CCR to fail. This is existing behavior, in that you can also trigger a connection rebuild by updating remote cluster settings. We recommend stopping long-running cross-cluster operations in our migration guide (https://www.elastic.co/guide/en/elasticsearch/reference/current/remote-clusters-migrate.html#remote-clusters-migration-stop) to avoid these sorts of failures.
```java
try {
    final PlainActionFuture<RemoteClusterConnectionStatus> future = new PlainActionFuture<>();
    updateRemoteCluster(clusterAlias, settings, true, future);
    final RemoteClusterConnectionStatus status = future.actionGet(10, TimeUnit.SECONDS);
```
I do not like stacking the timeouts one after the other like this. If we have, say, 150 remotes, and they all hit the 10s timeout, then with this logic we'll be waiting for 1500s = 25 minutes. If we need to time out at all, then we should have one top-level timeout, and TBH I'd rather it was in the caller, seeing as how we don't do anything (except logging) when we hit the timeout anyway.

I know we have this problem when applying cluster states too, and IMO that's a bug. It's tricky to solve in that case because of the mechanics of how settings are applied, but I would rather avoid making the same mistake here too.
Fair point. I don't feel strongly about the per-connection timeout (nor the top-level timeout, for that matter).

If we have 150 remotes and each connection takes >10s to rebuild, the overall request will still wait for 25m. A top-level timeout could prevent this, but it's difficult to pick a reasonable default since I expect timeouts to vary greatly based on the network situation for a given connection (@jakelandis points this out here, which prompted me to consider a per-connection timeout) -- e.g., a healthy connection will rebuild in <1s, but a slow/unstable network could result in long waits.

It could be nice to add a timeout parameter to the initial, generic reload-secure-settings request, but we have no way of passing this down to the individual plugin reload call, since the reload() call only takes settings as input, and I would prefer not to change the ReloadablePlugin interface here since that's a bigger change that requires separate input and deliberation.

The only other alternative that comes to mind is adding a dedicated setting where end-users can set a connection rebuild timeout, but that feels hacky and like overkill since, as you point out, we don't do anything (except logging) when we hit the timeout anyway.

Given that it's hard to come up with a generic top-level timeout, and that individual connection timeouts can give us very high (and pretty arbitrary) wait times, I prefer to remove the timeout altogether for now and possibly revisit this in the future. @DaveCTurner @jakelandis let me know if you have objections.
I would be happy with no timeout here, I think, especially if we trigger all the connection attempts in parallel, since there are other timeouts in play lower down the stack, preventing this from really waiting forever.

I do think it's worth reconsidering the synchronous nature of the reload() API, which forces us to call each plugin in turn (or do something worse, like spawning a thread per plugin). Stuff like this should be async all the way down the stack IMO, and then these concerns wouldn't really arise.
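The parallelism point above can be sketched with a stand-alone example (the class, the fixed delay, and the alias names are all hypothetical, not Elasticsearch code): when the per-connection work is triggered concurrently and the caller waits on the combined future, the total wait tracks the slowest single rebuild rather than the sum of per-connection timeouts.

```java
import java.util.List;
import java.util.concurrent.CompletableFuture;
import java.util.concurrent.ExecutorService;
import java.util.concurrent.Executors;

class ParallelRebuildSketch {
    // Stand-in for an async connection rebuild with some fixed latency.
    static CompletableFuture<Void> rebuild(String alias, ExecutorService executor) {
        return CompletableFuture.runAsync(() -> {
            try {
                Thread.sleep(100); // simulated rebuild latency
            } catch (InterruptedException e) {
                Thread.currentThread().interrupt();
            }
        }, executor);
    }

    public static void main(String[] args) {
        List<String> aliases = List.of("c1", "c2", "c3");
        ExecutorService executor = Executors.newFixedThreadPool(aliases.size());
        long start = System.nanoTime();
        CompletableFuture<?>[] all = aliases.stream()
            .map(alias -> rebuild(alias, executor))
            .toArray(CompletableFuture[]::new);
        CompletableFuture.allOf(all).join(); // single top-level wait
        long elapsedMs = (System.nanoTime() - start) / 1_000_000;
        executor.shutdown();
        // with enough threads, total wait tracks the slowest rebuild, not the sum
        System.out.println("all rebuilds done in ~" + elapsedMs + "ms");
    }
}
```

Waiting on each future in turn with `actionGet(10, TimeUnit.SECONDS)` is the sequential shape the review pushes back on: the worst case grows linearly with the number of remotes.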
```java
public void testUpgradeFromRcs1() throws Exception {
    // Setup RCS 1.0 and check that it works
    configureRemoteCluster("my_remote_cluster", fulfillingCluster, true, randomBoolean(), randomBoolean());
```
nit: would it be possible to assert no API key is configured for my_remote_cluster? Just a bit paranoid about ensuring this is really RCS 1.0. If it is anything but trivial, feel free to ignore this.
configureRemoteCluster calls checkRemoteConnection under the hood, which ensures that no credential is configured when basicSecurity is true here.

Checking the keystore would be a bit tricky, off the top of my head. Hope this is sufficient!
```java
assumeFalse(
    "Cannot run in FIPS mode since the keystore will be password protected and sending a password in the reload"
        + "settings api call, requires TLS to be configured for the transport layer",
    inFipsJvm()
```
> requires TLS to be configured for the transport layer

Where is this enforced? Also, isn't TLS enabled for the transport layer in these tests?
Also, could you log an issue for follow-up? Ideally we wouldn't skip these in FIPS mode.
Ah, good catch! The message is a copy-paste hoopla. FIPS is still a problem, though, around insufficient password length for the keystore password. I'll tweak the message and log a follow-up to address this.
@elasticmachine update branch

@elasticmachine update branch
This PR builds on #102798 by adding automatic remote connection rebuilding on cluster credentials changes. In particular, we rebuild a remote cluster connection if a credential for the associated cluster is newly added (i.e., we are moving from RCS 1.0 -> RCS 2.0) or removed (moving from RCS 2.0 -> RCS 1.0). A connection rebuild allows us to associate the correct profile (`_remote_server` in the case of RCS 2.0, or the "regular" transport profile for RCS 1.0) without requiring end-users to manually update remote cluster settings via a settings update call. More context on connection rebuilding is also in this comment.
Relates: ES-6764