VDiff2: Support Resuming VDiffs #10497
Merged
mattlord merged 52 commits into vitessio:main on Jun 27, 2022
And fix bug that caused the row estimate to always be 0. Signed-off-by: Matt Lord <mattalord@gmail.com>
Still need to actually resume ... Signed-off-by: Matt Lord <mattalord@gmail.com>
What's left is adding the lastPK results to the SELECT used on the source and target tablets for retrieving the rows to compare. Signed-off-by: Matt Lord <mattalord@gmail.com>
df3b50b to cb8a92e
In the case we processed no new rows the lastpk should remain the same. Before this we ended up clearing the lastpk out when doing a resume that had no new rows to examine, thus causing the subsequent resume to diff all of the rows again. Signed-off-by: Matt Lord <mattalord@gmail.com>
And add invariant check in updating table status Signed-off-by: Matt Lord <mattalord@gmail.com>
Remove debugging code used during development. Signed-off-by: Matt Lord <mattalord@gmail.com>
0b38f0e to 3c26cc1
Turns out that log does not support this verb. So we can stick to using it with fmt instead. Signed-off-by: Matt Lord <mattalord@gmail.com>
This provides accurate progress info during the resume run Signed-off-by: Matt Lord <mattalord@gmail.com>
And update release notes for new output Signed-off-by: Matt Lord <mattalord@gmail.com>
Since this was not working before this PR. :-) Signed-off-by: Matt Lord <mattalord@gmail.com>
    tables: "customer,Lead,Lead-1",
    resume: true,
    resumeInsert: `insert into customer(cid, name, typ) values(12345678, 'Testy McTester', 'soho')`,
    testCLIErrors: true, // test for errors in the simplest test case
Member
Since the CLI test is an independent one and called only once, we should probably just take it out of testWorkflow() and call it as a separate sub-test.
Member
Author
I had initially gone that route too, but then there are no workflows to test against, since testWorkflow() does SwitchWrites and Complete at the end, and the only errors you get result from the workflow not existing. :-) In doing that, I also realized there's some follow-up work to do in handling those error cases well (today you get an error about a missing key because the VDiff metadata is there but the backing vreplication workflow is not).
    IF(vdt.mismatch = 1, 1, 0) as has_mismatch, vdt.report as report
    from _vt.vdiff as vd inner join _vt.vdiff_table as vdt on (vd.id = vdt.vdiff_id)
    where vdt.vdiff_id = %d`
    // sqlUpdateVDiffState has a penultimate placeholder for any additional columns you want to update, e.g. `, foo = 1`
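
The "penultimate placeholder" idea in the comment above can be sketched roughly as follows. This is an illustration only, not the query from this PR: the template `sqlUpdateStateSketch`, the helpers `quote` and `buildUpdate`, and the `completed_at` column name are all assumptions made for the example.

```go
package main

import (
	"fmt"
	"strings"
)

// sqlUpdateStateSketch is a HYPOTHETICAL template modeled on the comment
// above: the second-to-last verb is a slot for optional extra column
// assignments, which must either be empty or start with ", ".
const sqlUpdateStateSketch = "update _vt.vdiff set state = %s%s where id = %d"

// quote wraps s in single quotes and doubles embedded single quotes.
// It is only meant for fixed, trusted state names in this sketch and is
// not a general-purpose SQL escaper.
func quote(s string) string {
	return "'" + strings.ReplaceAll(s, "'", "''") + "'"
}

// buildUpdate fills the template with a state, optional extra assignments,
// and the row id.
func buildUpdate(id int64, state, extra string) string {
	return fmt.Sprintf(sqlUpdateStateSketch, quote(state), extra, id)
}

func main() {
	// No extra columns: the slot collapses to an empty string.
	fmt.Println(buildUpdate(1, "started", ""))
	// One extra column (assumed name) appended through the slot.
	fmt.Println(buildUpdate(1, "completed", ", completed_at = utc_timestamp()"))
}
```

Keeping the slot empty by default means every caller shares one template, while callers that need to set an additional column pass a fragment beginning with a comma.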
mattlord added a commit to planetscale/vitess that referenced this pull request Jun 27, 2022
And fix a number of bugs discovered related to incorrect VDiff summary handling and other more minor things. Signed-off-by: Matt Lord <mattalord@gmail.com>
mattlord added a commit that referenced this pull request Jun 27, 2022
* CherryPick: VDiff2: Support Resuming VDiffs (#10497) And fix a number of bugs discovered related to incorrect VDiff summary handling and other more minor things. Signed-off-by: Matt Lord <mattalord@gmail.com>
Description
This work allows you to explicitly resume an existing `VDiff` workflow using the new `resume <UUID>` `VDiff` action. VDiff will then resume, picking up where it left off and comparing the records where the Primary Key column(s) are greater than the last record processed, with the progress and other status information saved when the run ends. This allows you to:

- … `VDiff` that may have encountered an ephemeral error (adding an auto-retry on error will come later)
- … `MoveTables` finishes the initial copy phase and then again just before `SwitchTraffic`)

In order to better support this for users and testing, we also now show the `CompletedAt` timestamp in the VDiff summary when it completes without error, and we also show `RowsCompared` during the run to accurately reflect the work done on resuming.
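
To make the "Primary Key column(s) are greater than the last record processed" condition concrete, here is a minimal sketch of how such a resume predicate could be built for a single-column or composite key. It is not the code from this PR: `resumePredicate` is a hypothetical helper, the `(cid, typ)` key on `customer` is assumed for the example, and column names are not escaped.

```go
package main

import (
	"fmt"
	"strings"
)

// resumePredicate builds a WHERE-clause fragment selecting rows whose primary
// key sorts strictly after lastPK. For a composite key (a, b) it produces
// "(a > ?) or (a = ? and b > ?)". It returns the fragment together with the
// bind values in placeholder order.
// HYPOTHETICAL helper for illustration; column names are used unescaped.
func resumePredicate(pkCols []string, lastPK []any) (string, []any, error) {
	if len(pkCols) == 0 || len(pkCols) != len(lastPK) {
		return "", nil, fmt.Errorf("need matching, non-empty columns and values, got %d and %d",
			len(pkCols), len(lastPK))
	}
	var terms []string
	var args []any
	for i := range pkCols {
		var parts []string
		// All earlier key columns must equal their last-seen values...
		for j := 0; j < i; j++ {
			parts = append(parts, pkCols[j]+" = ?")
			args = append(args, lastPK[j])
		}
		// ...and the current column must be strictly greater.
		parts = append(parts, pkCols[i]+" > ?")
		args = append(args, lastPK[i])
		terms = append(terms, "("+strings.Join(parts, " and ")+")")
	}
	return strings.Join(terms, " or "), args, nil
}

func main() {
	// Assumed composite key (cid, typ) and last processed row (12345678, 'soho').
	where, args, err := resumePredicate([]string{"cid", "typ"}, []any{12345678, "soho"})
	if err != nil {
		panic(err)
	}
	fmt.Println("select cid, name, typ from customer where", where, "order by cid, typ")
	fmt.Println(args...)
}
```

The same predicate would be applied to both the source and target queries so that the two row streams restart from the same position. When a resumed run finds no new rows, the saved last-PK value should be left unchanged, as one of the commits in this PR notes.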
Related Issue(s)
Checklist