Online DDL via VReplication#7419
Conversation
Signed-off-by: Shlomi Noach <2607934+shlomi-noach@users.noreply.github.com>
Signed-off-by: Shlomi Noach <2607934+shlomi-noach@users.noreply.github.com>
Signed-off-by: Shlomi Noach <2607934+shlomi-noach@users.noreply.github.com>
Signed-off-by: Shlomi Noach <2607934+shlomi-noach@users.noreply.github.com>
Signed-off-by: Shlomi Noach <2607934+shlomi-noach@users.noreply.github.com>
Signed-off-by: Shlomi Noach <2607934+shlomi-noach@users.noreply.github.com>
Signed-off-by: Shlomi Noach <2607934+shlomi-noach@users.noreply.github.com>
Signed-off-by: Shlomi Noach <2607934+shlomi-noach@users.noreply.github.com>
…parser Signed-off-by: Shlomi Noach <2607934+shlomi-noach@users.noreply.github.com>
Signed-off-by: Shlomi Noach <2607934+shlomi-noach@users.noreply.github.com>
Signed-off-by: Shlomi Noach <2607934+shlomi-noach@users.noreply.github.com>
Signed-off-by: Shlomi Noach <2607934+shlomi-noach@users.noreply.github.com>
Signed-off-by: Shlomi Noach <2607934+shlomi-noach@users.noreply.github.com>
Signed-off-by: Shlomi Noach <2607934+shlomi-noach@users.noreply.github.com>
…t on this branch's radar Signed-off-by: Shlomi Noach <2607934+shlomi-noach@users.noreply.github.com>
Signed-off-by: Shlomi Noach <2607934+shlomi-noach@users.noreply.github.com>
Signed-off-by: Shlomi Noach <2607934+shlomi-noach@users.noreply.github.com>
Signed-off-by: Shlomi Noach <2607934+shlomi-noach@users.noreply.github.com>
Signed-off-by: Shlomi Noach <2607934+shlomi-noach@users.noreply.github.com>
Signed-off-by: Shlomi Noach <2607934+shlomi-noach@users.noreply.github.com>
Signed-off-by: Shlomi Noach <2607934+shlomi-noach@users.noreply.github.com>
Signed-off-by: Shlomi Noach <2607934+shlomi-noach@users.noreply.github.com>
…ap query Signed-off-by: Shlomi Noach <2607934+shlomi-noach@users.noreply.github.com>
…writes to table, waiting for updated pos, renaming tables, releasing table, releasing locks Signed-off-by: Shlomi Noach <2607934+shlomi-noach@users.noreply.github.com>
Signed-off-by: Shlomi Noach <2607934+shlomi-noach@users.noreply.github.com>
Signed-off-by: Shlomi Noach <2607934+shlomi-noach@users.noreply.github.com>
Signed-off-by: Shlomi Noach <2607934+shlomi-noach@users.noreply.github.com>
Signed-off-by: Shlomi Noach <2607934+shlomi-noach@users.noreply.github.com>
Signed-off-by: Shlomi Noach <2607934+shlomi-noach@users.noreply.github.com>
rohit-nayak-ps
left a comment
There was a problem hiding this comment.
This is really great :-)
Left some comments/queries: in general the functionality looks good.
| "strategy": sqltypes.StringBindVariable(string(schema.DDLStrategyPTOSC)), | ||
| // isVReplMigrationReadyToCutOver sees if the vreplication migration has completed the row copy | ||
| // and is up to date with the binlogs. | ||
| func (e *Executor) isVReplMigrationReadyToCutOver(ctx context.Context, s *VReplStream) (isReady bool, err error) { |
There was a problem hiding this comment.
For low write qps it is possible that there are no binlog events for a table for a long time. If I read this logic correctly, in such cases the Online DDL workflow may not cut over automatically.
There was a problem hiding this comment.
Interesting. Isn't there a heartbeat mechanism for vreplication though?
There was a problem hiding this comment.
time_updated is updated by the heartbeat. However transaction_timestamp is only when a new binlog event is seen, not for a heartbeat. This is quite rare since it means there is no writes on the source at all. But it has been reported by some users. Also if a database is taken "offline" for some reason (in the sense that no app is writing to it) and the user starts a workflow this can happen.
Practically we may not need to address this now.
There was a problem hiding this comment.
removed evaluation of transaction-timestamp. I want to be on the safer side.
There was a problem hiding this comment.
timeUpdated will always be current because of the heartbeat so the moment the copy state is empty it will say that the migrate workflow is ready for cutover. Am I missing something?
There was a problem hiding this comment.
Am I missing something?
Ermm... Not sure why you're asking? Am I missing something? 😅
There was a problem hiding this comment.
To elaborate a bit, though I'm still unsure what you meant:
timeUpdated will always be current
Unless there's some lag, or vreplication is stopped
so the moment the copy state is empty
We also validate that pos in non-empty. Were you suggesting the migration might cut-over before starting copying rows?
There was a problem hiding this comment.
Having discussed this: meanwhile checking transaction_timestamp is the safer approach. time_updated gets update regardless of incoming events.
So the risk of transaction_timestamp is in completely stale servers. However, it's enough that some table is updated and the problem goes away. But if the server is completely stale, cut-over will not happen.
To solve that we can either:
- force injection of some dummy change, or
- compare GTID pos (likely not a good idea, because
_vttables are getting changed, affecting GTID, but scrubbed in the streamer) - modify
streamerto include more information that we can use on target side.
| // isVReplMigrationReadyToCutOver sees if the vreplication migration has completed the row copy | ||
| // and is up to date with the binlogs. | ||
| func (e *Executor) isVReplMigrationReadyToCutOver(ctx context.Context, s *VReplStream) (isReady bool, err error) { | ||
| // Check all the cases where migration is still running: |
There was a problem hiding this comment.
When an online ddl completes, vttablets and vtgates will need to reload their schema as per the current mechanisms, before they see the changes made by the ddl correct?
There was a problem hiding this comment.
Right. I issue a ReloadSchema before running the migration. Should I run ReloadSchema immediately after completing the migration, or should I actually do that during the cut-over (when tables have been switched, and writes are still blocked)?
There was a problem hiding this comment.
Doing it when writes are blocked might be better since binlog events after the write will then see the new schema. Otherwise we may have a race: new events for the altered table may be processed before the new schema is reloaded.
But the ReloadSchema is just for this tablet, right?
There was a problem hiding this comment.
if err := tmClient.ReloadSchema(ctx, tablet.Tablet, ""); err != nil {
return err
}correct
There was a problem hiding this comment.
Just clarifying, that this goes through gRPC; so I want to make sure this isn't an expensive event, because meanwhile we're blocking (rather -- rejecting) writes to the table, so it's a critical point in time.
There was a problem hiding this comment.
Define expensive...: ReloadSchema can take a while since it parses the entire db schema. It has been speeded up recently but for a large number of tables it will probably take a non-trivial amount of time. I would think, still in the order of milliseconds, but not entirely sure.
There was a problem hiding this comment.
non-trivial amount of time.
milliseconds is just fine. Would it take more than 1 second (I'd define "expensive" at 1 second) ? e.g. on a schema with 10,000 tables? Is it possible to ReloadSchema for just one table?
There was a problem hiding this comment.
(having discussed this in sync with @rohit-nayak-ps) reloading schema can take a substantial time if many thousands of tables are involved. For now, we issue an async ReloadSchema (with goroutine). In the future, we will look into reloading a single table.
Signed-off-by: Shlomi Noach <2607934+shlomi-noach@users.noreply.github.com>
Signed-off-by: Shlomi Noach <2607934+shlomi-noach@users.noreply.github.com>
Signed-off-by: Shlomi Noach <2607934+shlomi-noach@users.noreply.github.com>
Signed-off-by: Shlomi Noach <2607934+shlomi-noach@users.noreply.github.com>
Signed-off-by: Shlomi Noach <2607934+shlomi-noach@users.noreply.github.com>
Signed-off-by: Shlomi Noach <2607934+shlomi-noach@users.noreply.github.com>
Signed-off-by: Shlomi Noach <2607934+shlomi-noach@users.noreply.github.com>
…tion-online-ddl Signed-off-by: Shlomi Noach <2607934+shlomi-noach@users.noreply.github.com>
|
I have merged #7492 here. This adds a new |
Signed-off-by: Shlomi Noach <2607934+shlomi-noach@users.noreply.github.com>
Signed-off-by: Shlomi Noach <2607934+shlomi-noach@users.noreply.github.com>
|
7e2a074 introduces dummy statement injection for when we run vreplication based online DDL, and the server is utterly stale. In this situation, The solution is hacky: if we notice |
|
@rohit-nayak-ps actually, re: discussion about cut-over and stale server -- the tests disagree. The tests are happy to cut-over when the copy is complete, even when there's no traffic on the table. Did some local tests, inserted two rows on a table, then slept, then migrated -- cut-over took place. |
Signed-off-by: Shlomi Noach <2607934+shlomi-noach@users.noreply.github.com>
|
I notice that it is impossible to |
Signed-off-by: Shlomi Noach <2607934+shlomi-noach@users.noreply.github.com>
|
@sougou confirmed changes to I've reverted 7e2a074 ; there is no need to inject dummy transaction. The existing mechanism works even on stale servers. |
|
Looking at This PR, meanwhile, is good to review, notwithstanding the fact that |
|
I should clarify. Adding a |
Ready for review
This PR implements a "forward" alter table via VReplication.
General design described in #6926 (comment), which this PR implements.
Notable bullets about the changes in this PR:
The main logic of this PR is in
go/vt/vttablet/onlineddl/vrepl.goandgo/vt/vttablet/onlineddl/executor.gogo/vt/vttablet/onlineddl/executor.gocan now invoke, cancel and terminate VReplication based migrations. Some of this management is symmetric to how we invoke, cancel and terminategh-ostorpt-osc, but there are also significant differences:gh-ostandpt-osccan only work within the context of this tablet (invoked as an OSExec), VReplication migrations can execute by one tablet and continued on another, even in face of failover.gh-ostandpt-oschave self-logic to run the migration. VReplication needs to be told what exactly it should run. The tablet knows about aalter tablestatement, and needs to create a matching rule/filter query for VReplication.gh-ostandpt-oschave self-logic to cut-over. Vreplication does not.Executoridentifies the occasion for cut-over and implements the cut-over logic, which entails stopping writes to original tables, waiting for pos, stopping VReplication, replacing tables.Imported a bunch of code from https://github.com/github/gh-ost. These are found under go/vt/vttablet/onlineddl/vrepl. The kind of functionality in those files:
We don't strictly need all the above as-is. We can replace parsing with vitess's parsing. We don't need to track datatypes because vreplication does. But I copied&pasted those files because they are mature and stable and get the work done.
I intend to refactor them later on and remove redundant/unrelated code, or replace with existing Vitess functionality. But I'd like to do so on a separate PR so as to focus on the main overall functionality here.
Split onlineddl/endtoend tests into
endtoend_ghostandendtoend_vrepl. They are mostly similar but with some changes (like how do you throttle a migration). The tests are OK-ish but need to be more elaborate. I again wish to follow up in a future PR as I am yet to design a satisfactoryendtoendtest for schema migrations and failure scenarios.Due to some earlier attempt to use
wranglerinsidetabletserver, I created a newinterfacecalledvexec.Executor, and refactored ddlExecutor to bevexec.Executorinstead ofonlineddl.Executor. I ended up not usingwranglerinsidetabletserverbut I liked the refactor and kept it.initial PR comment when this was Work In Progress:
Related Issue(s)
Checklist
Deployment Notes
Impacted Areas in Vitess
Components that this PR will affect: