-
Notifications
You must be signed in to change notification settings - Fork 1.4k
Description
Final-finally-finalizationally, here's an asynchronous, safe, atomic cut-over phase.
This solution doesn't cause "table outage" (as in #65).
Here are the steps for a safe, atomic cut-over:
The solution we offer is now based on two connections only (as opposed to three, in the optimistic approach). "Our" connections will be C10, C20. The "normal" app connections are C1..C9, C11..C19, C21..C29.
-
Connections C1..C9 operate on
tblwith normal DML:INSERT, UPDATE, DELETE -
Connection C10:
CREATE TABLE tbl_old (id int primary key) COMMENT='magic-be-here' -
Connection C10:
LOCK TABLES tbl WRITE, tbl_old WRITE -
Connections C11..C19, newly incoming, issue queries on
tblbut are blocked due to theLOCK -
Connection C20:
RENAME TABLE tbl TO tbl_old, ghost TO tbl- This is blocked due to the
LOCK, but gets prioritized on top connections C11..C19 and on top C1..C9 or any other connection that attempts DML ontbl
- This is blocked due to the
-
Connections C21..C29, newly incoming, issue queries on
tblbut are blocked due to theLOCKand due to theRENAME, waiting in queue -
Connection C10: checks that C20's
RENAMEis applied (looks for the blockedRENAMEinshow processlist) -
Connection 10:
DROP TABLE tbl_old
Nothing happens yet;tblis still locked. All other connections still blocked. -
Connection 10:
UNLOCK TABLESBAM! The
RENAMEis first to execute, ghost table is swapped in place oftbl, then C1..C9, C11..C19, C21..C29 all get to operate on the new and shinytbl
Some notes
- We create
tbl_oldas a blocker for a premature swap - It is allowed for a connection to
DROPa table it has under aWRITE LOCK - A blocked
RENAMEis always prioritized over a blockedINSERT/UPDATE/DELETE, no matter who came first
What happens on failures?
Much fun. Just works; no rollback required.
- If C10 errors on the
CREATEwe do not proceed. - If C10 errors on the
LOCKstatement, we do not proceed. The table is not locked. App continues to operate as normal. - If C10 dies just as C20 is about to issue the
RENAME:- The lock is released, the queries C1..C9, C11..C19 immediately operate on
tbl. - C20's
RENAMEimmediately fails becausetbl_oldexists. - The entire operation is failed, but nothing terrible happens; some queries were blocked for some time is all. We will need to retry everything
- The lock is released, the queries C1..C9, C11..C19 immediately operate on
- If C10 dies while C20 is blocked on
RENAME: Mostly similar to the above. Lock released, then C20 fails theRENAME(becausetbl_oldexists), then all queries resume normal operation - If C20 dies before C10 drops the table, we catch the error and let C10 proceed as planned:
DROP,UNLOCK. Nothing terrible happens, some queries were blocked for some time. We will need to retry - If C20 dies just after C10
DROPs the table but before the unlock, same as above. - If both C10 and C20 die, no problem:
LOCKis cleared;RENAMElock is cleared. C1..C9, C11..C19, C21..C29 are free to operate ontbl.
No matter what happens, at the end of operation we look for the ghost table. Is it still there? Then we know the operation failed, "atomically". Is it not there? Then it has been renamed to tbl, and the operation worked atomically.
A side note on failure is the matter of cleaning up the magic tbl_old. Here this is a matter of taste. Maybe just let it live and avoid recreating it, or you can drop it if you like.
Impact on app
App connections are guaranteed to be blocked, either until ghost is swapped in, or until operation fails. In the former, they proceed to operate on the new table. In the latter, they proceed to operate on the original table.
Impact on replication
Replication only sees the RENAME. There is no LOCK in the binary logs. Thus, replication sees an atomic two-table swap. There is no table-outage.