
Conversation


@zuiderkwast zuiderkwast commented Nov 30, 2021

To avoid data loss, this commit adds a grace period during shutdown for lagging
replicas to catch up with the master's replication offset.

Done:

  • Wait for replicas when shutdown is triggered by SIGTERM and SIGINT.

  • Wait for replicas when shutdown is triggered by the SHUTDOWN command. A new
    blocked client type BLOCKED_SHUTDOWN is introduced, allowing multiple clients
    to call SHUTDOWN in parallel.
    Note that these clients don't receive a reply unless an error happens and the shutdown is aborted.

  • Log warning for each replica lagging behind when finishing shutdown.

  • Pause write commands (CLIENT_PAUSE_WRITE) while waiting for replicas.

  • Configurable grace period 'shutdown-timeout' in seconds (default 10).

  • New flags for the SHUTDOWN command:

    • NOW disables the grace period for lagging replicas.

    • FORCE ignores errors writing the RDB or AOF files which would normally
      prevent a shutdown.

    • ABORT cancels ongoing shutdown. Can't be combined with other flags.

  • New field in the output of the INFO command: 'shutdown_in_milliseconds'. The
    value is the remaining maximum time to wait for lagging replicas before
    finishing the shutdown. This field is present in the Server section only
    during shutdown.

Not directly related:

  • When shutting down, if there is an AOF saving child, it is killed even if AOF
    is disabled. This can happen if BGREWRITEAOF is used when AOF is off.

  • Client pause now has an end time and a type (WRITE or ALL) per purpose. The
    different pause purposes are the CLIENT PAUSE command, failover and
    shutdown. Unpausing clients for one purpose doesn't affect the client
    pause for other purposes. For example, the CLIENT UNPAUSE command doesn't
    affect a client pause initiated by the failover or shutdown procedures, and a
    completed failover or a failed shutdown doesn't unpause clients paused by the
    CLIENT PAUSE command.

Notes:

  • DEBUG RESTART doesn't wait for replicas.

  • We already have a warning logged when a replica disconnects. This means that
    if any replica connection is lost during the shutdown, it is either logged as
    disconnected or as lagging at the time of exit.

Fixes #9693

@zuiderkwast zuiderkwast marked this pull request as ready for review November 30, 2021 18:20
@zuiderkwast zuiderkwast requested a review from yossigo December 1, 2021 11:24
@bjosv bjosv left a comment

Nice!


oranagra commented Dec 9, 2021

@zuiderkwast it occurred to me that there's probably an opportunity for some semi-related cleanup.
When we get SIGTERM during loading, we immediately exit.
IIRC this code was written before prepareForShutdown had the NOSAVE option (which was added for the SHUTDOWN command).
Now that it has it, I think it's better to go through prepareForShutdown and pass the appropriate flags (NOSAVE / FORCE).
There should be no fork children at that time, but it's still a good idea to go through the other cleanup steps (e.g. modules).

While we are on the subject, I noticed that we terminate the AOF child only if AOF is enabled, but currently it is also possible to run BGREWRITEAOF when it's disabled (see #9794), so I think that child should be stopped too.
So, if you're already working in that area, we can make these additional cleanups.

This reverts commit ee6ee3a.

The reverted commit broke the following test cases:

```
*** [err]: diskless slow replicas drop during rdb pipe in tests/integration/replication.tcl
log message of '"*Diskless rdb transfer, done reading from pipe, 1 replicas still up*"' not found in ./tests/tmp/server.2561.79/stdout after line: 60 till line: 80
*** [err]: diskless all replicas drop during rdb pipe in tests/integration/replication.tcl
log message of '"*Diskless rdb transfer, last replica dropped, killing fork child*"' not found in ./tests/tmp/server.2561.79/stdout after line: 110 till line: 131
```
@zuiderkwast

@redis/core-team May I have your attention again? Two more things added in this PR:

  • Client pause per purpose (failover, shutdown, client pause command)
  • SHUTDOWN ABORT

@oranagra oranagra merged commit 45a155b into redis:unstable Jan 2, 2022

oranagra commented Jan 2, 2022

@zuiderkwast thank you.
I suppose there are some docs that would benefit from an update about this change:
obviously the SHUTDOWN command (which must mention the new config and the new arguments), but maybe some other places too?

@zuiderkwast

@oranagra Thank YOU! Yes, I have updated the page about signal handling. It's merged already in redis/redis-doc#1711.

@zuiderkwast zuiderkwast deleted the graceful-shutdown branch March 17, 2022 12:33
oranagra pushed a commit that referenced this pull request Mar 20, 2022
…cked (#10440)

fix #10439. see #9872
When executing SHUTDOWN we pause the client so we can un-pause it
if the shutdown fails.
This could happen during the timeout, if the shutdown is aborted, but it could
also happen from within the initial `call()` to shutdown, if the RDB save fails.
In that case, when we return to `call()`, we'll crash if `c->cmd` has been set to NULL.

The call stack is:
```
unblockClient(c)
replyToClientsBlockedOnShutdown()
cancelShutdown()
finishShutdown()
prepareForShutdown()
shutdownCommand()
```

What's special about SHUTDOWN in this respect is that the client can be paused,
and then un-paused before the original `call()` returns.
Tests were added for both a failed shutdown and a follow-up successful one.

Labels

  • approval-needed: Waiting for core team approval to be merged
  • release-notes: indication that this issue needs to be mentioned in the release notes
  • state:major-decision: Requires core team consensus
  • state:needs-doc-pr: requires a PR to the redis-doc repository

Development

Successfully merging this pull request may close these issues.

[BUG] Data loss when performing a graceful master shutdown under high load

8 participants