Creating a transport action for the CoordinationDiagnosticsService#87984
Conversation
|
Hi @masseyke, I've created a changelog YAML for you. |
…f github.com:masseyke/elasticsearch into feature/health-api-master-stability-transport-action
|
Pinging @elastic/es-data-management (Team:Data Management) |
|
@elasticmachine update branch |
andreidan
left a comment
There was a problem hiding this comment.
Thanks for adding this Keith.
Left a few suggestions
| protected void doExecute(Task task, CoordinationDiagnosticsAction.Request request, ActionListener<Response> listener) { | ||
| listener.onResponse( | ||
| new Response( | ||
| new CoordinationDiagnosticsService(clusterService, coordinator, masterHistoryService).diagnoseMasterStability( |
There was a problem hiding this comment.
shall we inject the service or create it once in the constructor? are there thread-safety concerns if we do so? (I'd think there shouldn't be as it's only reading things here)
There was a problem hiding this comment.
Yeah I think you're right -- I don't think there would be any thread-safety problems.
There was a problem hiding this comment.
And on second look, I'm not sure why I wasn't injecting that in the first place.
| import static org.elasticsearch.cluster.coordination.CoordinationDiagnosticsService.CoordinationDiagnosticsResult; | ||
|
|
||
| /** | ||
| * This action exposes CoordinationDiagnosticsService.diagnoseMasterStability() so that a node can get a remote node's view of |
There was a problem hiding this comment.
nit: {@link CoordinationDiagnosticsService#diagnoseMasterStability}
| } | ||
|
|
||
| /** | ||
| * This transport action calls the CoordinationDiagnosticsService on a remote node. |
There was a problem hiding this comment.
nit: This doc is rather misleading to me. The transport action exposes the functionality via the transport service (as opposed to calling a service on a remote node)
| YELLOW((byte) 2), | ||
| RED((byte) 3); | ||
|
|
||
| private final byte value; |
There was a problem hiding this comment.
I don't think this value field is needed (I think it adds a bit of confusion).
Enums have ordinals already.
Let's remove the value field and use in.readEnum and out.writeEnum to put it on the wire (it'll only write the enum ordinal).
What do you think?
There was a problem hiding this comment.
Huh. Somehow I didn't realize that those methods existed. Switching to use them now.
|
@elasticmachine update branch |
andreidan
left a comment
There was a problem hiding this comment.
LGTM, thanks for iterating on this Keith
Left a minor suggestion
| public TransportAction( | ||
| TransportService transportService, | ||
| ActionFilters actionFilters, | ||
| CoordinationDiagnosticsService coordinationDiagnosticsService |
|
|
||
| @Override | ||
| protected void doExecute(Task task, CoordinationDiagnosticsAction.Request request, ActionListener<Response> listener) { | ||
| listener.onResponse(new Response(coordinationDiagnosticsService.diagnoseMasterStability(request.explain))); |
There was a problem hiding this comment.
Shall we use ActionRunnable.wrap here to make sure that a potential exception from diagnoseMasterStability gets indeed propagated to listener.onFailure?
There was a problem hiding this comment.
I think that's already handled for us in the parent class here right? https://github.com/elastic/elasticsearch/blob/master/server/src/main/java/org/elasticsearch/action/support/TransportAction.java#L79. I'll confirm that that is where it is called from.
There was a problem hiding this comment.
Ah you're right. TIL. Thanks for looking into this!
This exposes the CoordinationDiagnosticsService (#87672) through a transport action so that it can be called remotely as

part of the health API in the event that: (1) there has been no master recently, (2) there are master-eligible nodes in the cluster, (3) none are elected, and (4) the current node is not master eligible. In the diagram below, this action will be used in the 1.2.2.3 branch (it is the long arrow back to branch 1). The logic that actually uses it will be within the CoordinationDiagnosticsService, and will be in a follow-up PR. Requests for this action will be sent directly to the remote node, and the assumption is that the cluster that likely does not have a master node.