Description
During the cut-over operation, gh-ost issues `lock tables` on the tables before they are renamed. After the rename, `unlock tables` is issued to release the locks.
Today, if gh-ost pauses or freezes (the process remains running but is unresponsive due to a host problem) between the `lock tables` and the `unlock tables`, the locks are not released. We have not been able to explain what caused the host running gh-ost to essentially freeze execution, but this occurred in production, and the locks were not released until the MySQL `wait_timeout` (which kills idle connections) expired.
In theory, this can be reproduced as follows:
- Add a pause after the `lock tables` step in the cut-over (hand-wavy)
- Freeze the `gh-ost` process with `kill -TSTP [pid]` or `kill -STOP [pid]`
- Observe that the table locks are not released until `wait_timeout` (default 30 minutes)
To address this, I plan to shorten the `wait_timeout` of the applier MySQL session during cut-over only, as this is the only time a short idle timeout is advantageous. After the cut-over, the session's `wait_timeout` will be restored to the server default.