Move RestClusterStateAction Response Serialization to Management Pool#65843
Move RestClusterStateAction Response Serialization to Management Pool#65843original-brownbear merged 5 commits intoelastic:masterfrom original-brownbear:cs-response-on-management-thread
Conversation
Responding with the full cluster state implies serializing the full cluster state on the IO thread. In case of very large cluster states this serialization is not a trivial action and can take multiple seconds so it shouldn't be happening on a transport thread.
|
Pinging @elastic/es-distributed (Team:Distributed) |
DaveCTurner
left a comment
There was a problem hiding this comment.
I'm fairly convinced we should move the work somewhere that's not a transport thread, but not convinced that the MANAGEMENT threadpool is the right place, for similar reasons to those we discussed in #51992.
I also worry that this might have an impact on CCR which uses (well-behaved) cluster state requests for updating the metadata on the follower. This change introduces an interaction between CCR and potentially-slow monitoring activity.
I left a couple of other questions too.
.../src/main/java/org/elasticsearch/action/admin/cluster/state/TransportClusterStateAction.java
Outdated
Show resolved
Hide resolved
.../src/main/java/org/elasticsearch/action/admin/cluster/state/TransportClusterStateAction.java
Outdated
Show resolved
Hide resolved
| if (clusterStateRequest.local() == false && | ||
| threadPool.relativeTimeInMillis() - startTimeMs > | ||
| clusterStateRequest.masterNodeTimeout().millis()) { | ||
| throw new ElasticsearchTimeoutException("Timed out getting cluster state"); |
|
Linking #51992 in this comment too since apparently a link in a review comment doesn't create a reverse link on the target issue. |
I agree with those reasons. Then again I don't see this being much different from #57937 ... I have no scientific way of determining whether or not it yet again is just the REST handler that is slow or the transport action as a whole. But ... I can't see how a O(100MB) cluster state would serialize as quickly as we would want it to for a transport layer action so I went with the change for the transport layer as well. |
Indeed. Could we give CCR its own action for getting the index metadata of a single index? It currently does some follower-side retries that could reasonably move onto the leader too. That way we're not mixing up random monitoring/diagnostics/abuse with well-behaved internal stuff and I'd be more comfortable with pushing this onto a very restricted threadpool. |
++ sounds good. Maybe do that in a separate PR (since it's one of these nasty BwC things and that's going to take a little more time to do :)) and reduce this PR to just the REST layer change for now (which will probably in the real world like with the mappings be the real world thing causing the warning logs anyway). |
|
Sure, sounds good to me. |
| assertThat(future2.isDone(), is(true)); | ||
| }); | ||
| ClusterStateResponse response = future2.actionGet(); | ||
| response = future2.get(10L, TimeUnit.SECONDS); |
There was a problem hiding this comment.
I kept this test cleanup even though it's not necessary now with the transport action changes reverted because busy asserting on futures was just weird ...
Alright, all done, all green now :) |
DaveCTurner
left a comment
There was a problem hiding this comment.
LGTM, I left one small optional request.
| builder.endObject(); | ||
| return new BytesRestResponse(RestStatus.OK, builder); | ||
| } | ||
| }.onResponse(response))); |
There was a problem hiding this comment.
Seems a bit weird to create the listener just to immediately complete it. Why not inline this?
There was a problem hiding this comment.
Did the same for the mappings, so I didn't have to duplicate the logic in RestBuilderListener but now that we have two spots that follow this pattern I think we can refactor this in a follow-up and extract the logic for writing out the response somewhere so we don't have to do this.
|
Oh yes and the PR title and description aren't accurate any more either. |
|
Thanks David, fixed the description/title :) |
Moving the cluster state response serialization to the management thread just like we did for the mappings response in #57937 since it's a potentially very large and slow to serialize response.