Document and/or gracefully fail creating invalid ILM policies in mixed-cluster #37085
Description
Problem Context
In a mixed-version cluster whose master node runs a version with lifecycle actions
that older versions do not know about, failures can be unfriendly.
Creating a policy that uses a newly introduced action succeeds on the master, but the policy
can then fail in one of two ways: when ILM APIs are called through an older coordinating node,
or when an unsupported action runs on an older node.
Two scenarios:
- Requesting the policy from an older node:
{"error":{"root_cause":[{"type":"transport_serialization_exception","reason":"Failed to deserialize response from handler [org.elasticsearch.transport.TransportService$ContextRestoreResponseHandler]"}],"type":"transport_serialization_exception","reason":"Failed to deserialize response from handler [org.elasticsearch.transport.TransportService$ContextRestoreResponseHandler]","caused_by":{"type":"illegal_argument_exception","reason":"Unknown NamedWriteable [org.elasticsearch.xpack.core.indexlifecycle.LifecycleAction][freeze]"}},"status":500}
- Executing a new lifecycle action that calls APIs that do not exist on older nodes
Solution
The documentation needs more guidance on how to manage new policies in a mixed-version cluster. It should recommend stopping ILM while a rolling upgrade is in progress, which would avoid these unintuitive errors.
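As a rough illustration of what such guidance could describe, the sketch below stops ILM before an upgrade and restarts it afterwards using the ILM stop, status, and start endpoints. The `send` callable, the function names, and the polling parameters are assumptions made for this example; `send(method, path)` stands in for whatever HTTP client is used and is expected to return the parsed JSON response.

```python
import time


def pause_ilm(send, poll_interval=1.0, max_polls=30):
    """Ask ILM to stop, then wait until it reports STOPPED.

    `send` is a hypothetical callable (method, path) -> dict that performs
    the HTTP request against the cluster and returns the decoded JSON body.
    Returns True once ILM has stopped, False if it did not stop in time.
    """
    send("POST", "/_ilm/stop")
    for _ in range(max_polls):
        status = send("GET", "/_ilm/status")
        # ILM may report STOPPING for a while before it reaches STOPPED.
        if status.get("operation_mode") == "STOPPED":
            return True
        time.sleep(poll_interval)
    return False


def resume_ilm(send):
    """Restart ILM once every node runs the new version."""
    send("POST", "/_ilm/start")
```

A rolling upgrade would then call `pause_ilm` before the first node is restarted and `resume_ilm` after the last node has rejoined.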
In addition to better documentation, it may make sense to reject the creation of these invalid policies while the cluster is in a mixed-version state, and to explain the situation to the user.
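One way such a check could work is sketched below: compare each action in the policy against the oldest node version in the cluster and reject the request with an explanatory message. This is not Elasticsearch code. The `ACTION_INTRODUCED_IN` table, its version numbers, and the names `validate_policy` and `UnsupportedActionError` are hypothetical and exist only to illustrate the idea.

```python
# Hypothetical table mapping an ILM action to the first version that supports it.
# The version numbers are illustrative values only, not taken from Elasticsearch.
ACTION_INTRODUCED_IN = {
    "rollover": (6, 6, 0),
    "freeze": (6, 7, 0),
}


class UnsupportedActionError(ValueError):
    """Raised when a policy uses an action that some node cannot understand."""


def validate_policy(action_names, node_versions):
    """Reject a policy if any action is newer than the oldest node in the cluster.

    action_names: names of the actions used by the policy.
    node_versions: (major, minor, patch) tuples, one per node.
    """
    min_version = min(node_versions)
    # Actions missing from the table are assumed to have always been available.
    unsupported = sorted(
        name
        for name in action_names
        if ACTION_INTRODUCED_IN.get(name, (0, 0, 0)) > min_version
    )
    if unsupported:
        oldest = ".".join(str(part) for part in min_version)
        raise UnsupportedActionError(
            "policy uses actions " + str(unsupported)
            + " that are not supported by the oldest node in the cluster ("
            + oldest + "); finish the rolling upgrade before creating this policy"
        )
```

With a check of this kind, the request fails up front with a message the user can act on, instead of surfacing later as a serialization error on an older node.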