PR #6503 changed Nexus to attempt to automatically restart instances that are in the `Failed` state. Now that we do this, we should probably change the allowable instance state transitions to permit a user to stop an instance that is `Failed`, as a way to say "stop trying to restart this instance" (as `Stopped` instances are not restarted).
This would have slightly different semantics from changing the instance's auto-restart policy using a future instance-reconfigure API. Stopping a `Failed` instance would mean "stop trying to restart this now; if it is later started and then transitions to `Failed` again, continue using whatever its auto-restart policy is", while changing the auto-restart policy would mean "don't try to automatically restart this even if it's eventually restarted again".[^1]
If we do this, we should definitely also make `SagaUnwound` instances appear as `Failed` rather than `Stopped`, as I discussed in #6638 (comment).
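To make the difference between the two "stop restarting" mechanisms concrete, here is a minimal Rust sketch of the proposed behavior. Every name in it (`InternalState`, `ExternalState`, `AutoRestartPolicy`, `Instance`, and its methods) is hypothetical and chosen for illustration; none of it is taken from Nexus. It also assumes that anything which appears as `Failed` is a candidate for auto-restart.

```rust
/// Hypothetical, simplified internal instance states (not Nexus's real types).
#[derive(Clone, Copy, Debug, PartialEq, Eq)]
pub enum InternalState {
    Running,
    Failed,
    SagaUnwound,
    Stopped,
}

/// Hypothetical states as shown to the user.
#[derive(Clone, Copy, Debug, PartialEq, Eq)]
pub enum ExternalState {
    Running,
    Failed,
    Stopped,
}

/// Hypothetical auto-restart policy, which persists across stop/start cycles.
#[derive(Clone, Copy, Debug, PartialEq, Eq)]
pub enum AutoRestartPolicy {
    Never,
    BestEffort,
}

#[derive(Clone, Copy, Debug, PartialEq, Eq)]
pub struct Instance {
    pub state: InternalState,
    pub policy: AutoRestartPolicy,
}

impl Instance {
    /// Proposed mapping: `SagaUnwound` appears as `Failed`, not `Stopped`.
    pub fn external_state(&self) -> ExternalState {
        match self.state {
            InternalState::Running => ExternalState::Running,
            InternalState::Failed | InternalState::SagaUnwound => ExternalState::Failed,
            InternalState::Stopped => ExternalState::Stopped,
        }
    }

    /// Assumption of this sketch: an instance is restarted only if it
    /// appears as `Failed` and its policy permits restarts.
    pub fn should_auto_restart(&self) -> bool {
        self.external_state() == ExternalState::Failed
            && self.policy == AutoRestartPolicy::BestEffort
    }

    /// Proposed change: stopping is permitted from `Failed` as well.
    /// The policy is left untouched, so it applies again after a later start.
    pub fn stop(&mut self) {
        self.state = InternalState::Stopped;
    }

    /// Start a stopped instance; other states are left unchanged here.
    pub fn start(&mut self) {
        if self.state == InternalState::Stopped {
            self.state = InternalState::Running;
        }
    }

    /// Simulate a running instance failing.
    pub fn fail(&mut self) {
        if self.state == InternalState::Running {
            self.state = InternalState::Failed;
        }
    }
}
```

In this sketch, calling `stop()` on a `Failed` instance turns off restarts only until the instance is started and fails again, whereas setting `policy` to `AutoRestartPolicy::Never` keeps them off across every later failure until the policy is changed back.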
[^1]: Unless, of course, the user changes the auto-restart policy again.