Skip to content

Transport action for ClusterFormationFailureHelper warnings#85782

Closed
andreidan wants to merge 4 commits intoelastic:masterfrom
andreidan:transport-action-cluster-formation-info
Closed

Transport action for ClusterFormationFailureHelper warnings#85782
andreidan wants to merge 4 commits intoelastic:masterfrom
andreidan:transport-action-cluster-formation-info

Conversation

@andreidan
Copy link
Copy Markdown
Contributor

This adds a transport action and the necessary infrastructure to make
the cluster formation warning messages available remotely via a
transport action.

Relates to #85624

This adds a transport action and the necessary infrastructure to make
the cluster formation warning messages available remotely via a
transport action.
@andreidan andreidan added >feature :Distributed/Health Issues for the health report API v8.3.0 labels Apr 11, 2022
@andreidan andreidan requested a review from DaveCTurner April 11, 2022 12:56
@elasticmachine elasticmachine added the Team:Data Management (obsolete) DO NOT USE. This team no longer exists. label Apr 11, 2022
@elasticmachine
Copy link
Copy Markdown
Collaborator

Pinging @elastic/es-data-management (Team:Data Management)

@elasticsearchmachine
Copy link
Copy Markdown
Collaborator

Hi @andreidan, I've created a changelog YAML for you.

@elasticmachine
Copy link
Copy Markdown
Collaborator

Pinging @elastic/clients-team (Team:Clients)

Copy link
Copy Markdown
Member

@DaveCTurner DaveCTurner left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

👍 I left some (hopefully) simplifying suggestions.


import static org.elasticsearch.rest.RestRequest.Method.GET;

public class RestClusterFormationInfoAction extends BaseRestHandler {
Copy link
Copy Markdown
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

I don't think we need to expose this at the REST layer. In practice we're only going to use it from within the cluster, i.e. as a transport action.

Copy link
Copy Markdown
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Ah fair enough, I was possibly envisioning adding more information to this API but that's probably not going to happen.

}

public Optional<String> getClusterFormationWarning() {
return clusterFormationFailureHelper.emittedWarning();
Copy link
Copy Markdown
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

I think we're going to need to need a structured response, e.g. to detect partial network partitions.

Copy link
Copy Markdown
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

I started out making tweaks to this PR for this and wound up just putting up this draft instead -- #87306. Before I put more effort into that, do you think it's appropriate to expose all of ClusterFormationState? We seem to need most of the information in it (if not all of it), but it does seem a little odd to publicly expose a private record like that.

Copy link
Copy Markdown
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

It's only a private record because we've never done anything with this info except log it before. Definitely makes sense to me to make it public and use it more widely.

* @return The result future
* @see org.elasticsearch.client.internal.Requests#clusterFormationInfoRequest(String...)
*/
ActionFuture<ClusterFormationInfoResponse> clusterFormationInfo(ClusterFormationInfoRequest request);
Copy link
Copy Markdown
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

I normally wouldn't bother adding dedicated methods to the client for an API that's not going to be generally useful. You can just call Client#execute directly in the few places where it's needed.

import org.elasticsearch.action.support.nodes.NodesOperationRequestBuilder;
import org.elasticsearch.client.internal.ElasticsearchClient;

public class ClusterFormationInfoRequestBuilder extends NodesOperationRequestBuilder<
Copy link
Copy Markdown
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Likewise this doesn't seem necessary, you may as well just construct the request directly.

* This will fetch the information from the requested nodes, and it'll likely be sourced from
* {@link Coordinator}.
*/
public class TransportClusterFormationInfoAction extends TransportNodesAction<
Copy link
Copy Markdown
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

I just realised that this probably can't be a TransportNodesAction: we're using this action when not a proper cluster so there's nothing holding the connections to other nodes open, so we'll have to explicitly acquire/release the connection yourself. See e.g. how Coordinator#handleJoinRequest and HotThreadsLoggingLagListener#onLagDetected wrap their transport activities in a call to TransportService#connect and release the connection afterwards. TransportNodesAction won't do that, we probably need to call TransportService#sendRequest directly.

Also it won't work to use the current cluster state to resolve node IDs since there is no cluster when using this. We'll need to use the exact DiscoveryNode instances that discovery tells us about.

@masseyke
Copy link
Copy Markdown
Member

I think we can now close this one in favor of #87306.

@andreidan andreidan closed this Jun 16, 2022
Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment

Labels

:Distributed/Health Issues for the health report API >feature Team:Clients Meta label for clients team Team:Data Management (obsolete) DO NOT USE. This team no longer exists. v8.4.0

Projects

None yet

Development

Successfully merging this pull request may close these issues.

7 participants