server: provide the ability to cancel a server shutdown after it has started #76423
Description
Requested/mentioned by @jeffswenson in this comment.
Also discussed previously with @mwang1026.
cc @chrisseto @jtsiros
Expressed use case:
When the k8s operator decides to reduce the number of pods allocated to a serverless cluster, it marks the surplus pods in a draining state. The sqlproxy maintains existing connections to draining pods, but it will not send new connections to them. If the cluster's utilization increases, the pods are put back in a serving state. If there was a way to cancel CRDB drains, we could notify the sql server it is draining as soon as the operator decides to reduce capacity. With one way drains, the operator needs to wait to send the drain until it is ready to remove the pod.
Let's see what it would take.
Background
A server shutdown is a process with multiple phases.
Some of these phases are irreversible: once they start, the shutdown can no longer be aborted.
However, the irreversible phases occur at the end:
1. Mark the server as non-ready for LBs, and non-ready for most incoming RPC traffic.
2. Wait for the time configured in `drain_wait` for LBs to direct traffic away.
3. Wait for at most `query_wait` for active local queries to finish.
4. Cancel all current sessions; wait for at most 1 second for the client conns to be closed.
5. Stop accepting distSQL query requests from other nodes.
6. Wait for at most `query_wait` for active processing on behalf of other nodes to finish.
7. Drain the SQL table leases -- this is irreversible without re-initializing the SQL layer.
8. (On KV-enabled nodes only) mark the KV liveness record as draining, which will prevent other nodes from contacting this node until it restarts -- this is irreversible without KV re-initialization.
9. (On KV-enabled nodes only) mark the KV stores as draining, which will prevent other nodes from rebalancing replicas to this node until it restarts -- this is irreversible without KV re-initialization.
10. If the drain was issued as part of a process shutdown, this is also where we start tearing down the entire process via the `stopper` instance -- this is irreversible without a process restart.
11. Then, again if we are aiming for process shutdown, this is where the process self-terminates.
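The split between cancellable and irreversible phases can be sketched as data. This is a minimal illustration, not the actual CRDB code: the phase names paraphrase the steps above, and `drainPhase`/`lastCancellablePhase` are hypothetical names.

```go
package main

import "fmt"

// drainPhase is a hypothetical representation of one shutdown phase.
type drainPhase struct {
	name       string
	reversible bool
}

// drainPhases paraphrases the steps listed above, in order.
var drainPhases = []drainPhase{
	{"mark non-ready for LBs/RPC", true},      // step 1
	{"wait drain_wait", true},                 // step 2
	{"wait query_wait (local queries)", true}, // step 3
	{"cancel sessions, wait 1s", true},        // step 4
	{"stop accepting remote distSQL", true},   // step 5
	{"wait query_wait (remote work)", true},   // step 6
	{"drain SQL table leases", false},         // step 7: irreversible
	{"mark KV liveness as draining", false},   // step 8: irreversible
	{"mark KV stores as draining", false},     // step 9: irreversible
	{"tear down via stopper, exit", false},    // steps 10-11
}

// lastCancellablePhase returns the index of the last phase after which
// the drain can still be aborted without restarting the SQL layer or
// the process.
func lastCancellablePhase(phases []drainPhase) int {
	last := -1
	for i, p := range phases {
		if !p.reversible {
			break
		}
		last = i
	}
	return last
}

func main() {
	fmt.Println(lastCancellablePhase(drainPhases)) // prints 5: steps 1-6 are cancellable
}
```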
Possible approach
As can be seen, only steps 7 and later are truly irreversible. However, they are also rather short (less than a few seconds). This means that after step 7 is reached, it doesn't pay off much to cancel the drain as opposed to simply restarting the process gracefully.
So we could imagine extending the drain logic so that the process running steps 1-6 can be canceled after it has started.
To achieve this, there are 4 design complexities:
- we need a new "shutdown backtrack" architecture that reverts the steps in reverse order, but only as far as what was done already and not further.
- we need to refactor the entire shutdown code so that it's captured in Go interfaces that can be mocked, otherwise unit testing will be close to impossible.
- each of the steps 3-6 in the list above, once they have started, is hard to cancel until the configured time has elapsed. However, for orchestration purposes, we'd like the "shutdown backtrack" operation to be as fast as possible. This means that we'll need to tweak the waiting logic to wait either the configured amount of time, or until the shutdown is canceled. Given the complexity of the current shutdown/wait code, this will be technically challenging to implement for steps 4, 5, 6.
- we will need to be very careful about how this "shutdown backtrack" interacts with the decommissioning process.
Jira issue: CRDB-13111