Online DDL cutover enhancements #18423
Signed-off-by: Shlomi Noach <2607934+shlomi-noach@users.noreply.github.com>
Pull Request Overview
This PR enhances the Online DDL cut-over by killing queries during the RENAME phase, tightening cut-over timeouts, improving logging with migration UUIDs, and treating very small force-cut-over thresholds as immediate.
- Extend `killTableLockHoldersAndAccessors` to accept excluded connection IDs and include UUID in logs
- Reduce excessive lock-wait timeouts and simplify force-cut-over logic for “immediate” thresholds
- Update tests to cover new force-cut-over behavior and fix test names
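The exclusion behavior summarized above can be sketched roughly as follows. This is only an illustration, not the code in `executor.go`: the helper `selectConnectionsToKill`, its signature, the connection IDs, and the UUID string are all hypothetical and chosen for the example.

```go
package main

import "fmt"

// selectConnectionsToKill is a hypothetical helper (not the Vitess
// implementation). It returns the candidate connection IDs with any
// excluded IDs removed, preserving the original order.
func selectConnectionsToKill(candidates []int64, excluded ...int64) []int64 {
	// Build a set of excluded IDs for constant-time lookups.
	skip := make(map[int64]struct{}, len(excluded))
	for _, id := range excluded {
		skip[id] = struct{}{}
	}
	toKill := make([]int64, 0, len(candidates))
	for _, id := range candidates {
		if _, found := skip[id]; found {
			// This connection belongs to the cut-over itself; leave it alone.
			continue
		}
		toKill = append(toKill, id)
	}
	return toKill
}

func main() {
	// Made-up migration UUID, used only to show the UUID as log context.
	uuid := "00000000_0000_0000_0000_000000000000"
	// Assume connections 102 and 104 run the cut-over queries themselves.
	for _, id := range selectConnectionsToKill([]int64{101, 102, 103, 104}, 102, 104) {
		fmt.Printf("migration %s: would kill connection %d\n", uuid, id)
	}
}
```

Passing the excluded IDs as a variadic parameter keeps existing call sites with no exclusions unchanged, which is one plausible reason to shape such a helper this way.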
Reviewed Changes
Copilot reviewed 2 out of 2 changed files in this pull request and generated 4 comments.
| File | Description |
|---|---|
| executor_test.go | Renamed and added test cases for force-cut-over threshold logic |
| executor.go | Added UUID to kill logic, skip exclusions, refined timeouts, and adjusted context cancellation during rename |
Comments suppressed due to low confidence (1)
go/vt/vttablet/onlineddl/executor_test.go:137
- [nitpick] The test name is ambiguous: it refers to "microsecond" but the threshold is 1ms. Consider renaming to clarify that force-cut-over at or below 1ms is immediate.
`name: "microsecond, ready irrespective of sinceReadyToComplete",`
Co-authored-by: Copilot <175728472+Copilot@users.noreply.github.com>
Signed-off-by: Shlomi Noach <2607934+shlomi-noach@users.noreply.github.com>
* origin/master:
  bugfix: Fix impossible query for UNION (vitessio#18463)
  fix topo use in local_example (vitessio#18357)
  fix: update go-upgrade tool to check patch number (vitessio#18252) (vitessio#18402)
  Update MAINTAINERS.md and CODEOWNERS (vitessio#18462)
  Add logging to binlog watcher actions (vitessio#18264)
  `schemadiff`: `RelatedForeignKeyTables()` (vitessio#18195)
  `vtorc`: allow recoveries to be disabled from startup (vitessio#18005)
  Fix `vttablet` not being marked as not serving when MySQL stalls (vitessio#17883)
  make xtrabackup ShouldDrainForBackup configurable (vitessio#18431)
  Reset in-memory sequence info on vttablet on UpdateSequenceTables request (vitessio#18415)
  Fix watcher storm during topo outages (vitessio#18434)
  Online DDL: resume vreplication after cut-over/RENAME failure (vitessio#18428)
  Online DDL cutover enhancements (vitessio#18423)
  VStreamer: change in filter logic (vitessio#18319)
  Online DDL metrics: `OnlineDDLStaleMigrationMinutes` (vitessio#18417)

Signed-off-by: Morgan Tocker <tocker@gmail.com>
Description
Several enhancements to the Online DDL cut-over logic:
- Queries and transactions are now also killed while the `RENAME` is being applied. Without this, there's a race condition where a long-running query could start just after queries are killed and right before the `RENAME` starts running.
- When killing queries and transactions, we skip the connection IDs of the cut-over related queries themselves.
- Reduce the excessive lock-wait timeout, previously the `5*onlineDDL.CutOverThreshold*4` value, which evaluates to `5min` on a `15s` timeout.
- If the `--force-cut-over-after` value is `<= 1ms`, we consider it "immediate", even if we somehow measure the time-since-ready to be less than that.

Related Issue(s)
Checklist
Deployment Notes