This is a meta-issue to track and discuss which ILM steps should be retryable, and under which circumstances. It relates to the effort to make the rollover action retryable (#44135) and to the more general strategy ILM will employ to make actions more resilient and self-healing (#42824).
Below are all the steps we use, grouped by action. Since we're unlikely to treat a step differently depending on the action it occurs in, each step is listed only once, under the first action (in alphabetical order) that uses it. The marker Terminal/Error steps are not listed.
Steps
AllocateAction
DeleteAction
ForceMergeAction
FreezeAction
RolloverAction
ShrinkAction
UnfollowAction
Scope
Any action or step that can be made retryable after a failure.
Duration
~ 2 months