Overview of the Issue
When initializing a VDiff, the target primary tablet will panic/crash if one of the workflow's tables has been dropped. For example:
I1202 21:37:50.292979 1 controller.go:117] Run finished for vdiff a35fef67-2899-4c57-951b-d87bf48100bf
panic: runtime error: index out of range [0] with length 0
goroutine 553 [running]:
vitess.io/vitess/go/vt/vttablet/tabletmanager/vdiff.(*tableDiffer).getSourcePKCols(0xc000043680)
vitess.io/vitess/go/vt/vttablet/tabletmanager/vdiff/table_differ.go:950 +0x10d8
vitess.io/vitess/go/vt/vttablet/tabletmanager/vdiff.(*workflowDiffer).buildPlan(0xc000b3d830, {0x37eee30, 0xc000a932a8}, 0xc000dc47e0, 0xc000fe2aa0)
vitess.io/vitess/go/vt/vttablet/tabletmanager/vdiff/workflow_differ.go:441 +0x81d
vitess.io/vitess/go/vt/vttablet/tabletmanager/vdiff.(*workflowDiffer).diff(0xc000b3d830, {0x37d8e70, 0xc00067e230})
vitess.io/vitess/go/vt/vttablet/tabletmanager/vdiff/workflow_differ.go:323 +0x379
vitess.io/vitess/go/vt/vttablet/tabletmanager/vdiff.(*controller).start(0xc0003ef3b0, {0x37d8e70, 0xc00067e230}, {0x37eee30, 0xc0010258d8})
vitess.io/vitess/go/vt/vttablet/tabletmanager/vdiff/controller.go:254 +0x953
vitess.io/vitess/go/vt/vttablet/tabletmanager/vdiff.(*controller).run(0xc0003ef3b0, {0x37d8e70, 0xc00067e230})
vitess.io/vitess/go/vt/vttablet/tabletmanager/vdiff/controller.go:143 +0x52d
created by vitess.io/vitess/go/vt/vttablet/tabletmanager/vdiff.(*Engine).addController in goroutine 69
vitess.io/vitess/go/vt/vttablet/tabletmanager/vdiff/engine.go:229 +0x1f0
Upon investigation, we found that one of the tables in the workflow had been DROPped in the source keyspace, which then led to this panic. So we need to make VDiff more robust in such cases.
h/t to @FancyFane for the investigation. ❤️
Reproduction Steps
git checkout main && make build
cd examples/local
alias vtctldclient='command vtctldclient --server=localhost:15999'
./101_initial_cluster.sh && ./201_customer_tablets.sh && ./202_move_tables.sh
commerce_primary_uid=$(vtctldclient GetTablets --keyspace commerce --tablet-type primary --shard "0" | awk '{print $1}' | cut -d- -f2 | bc)
mysql -u root --socket "${VTDATAROOT}/vt_0000000${commerce_primary_uid}/mysql.sock" vt_commerce -e "drop table corder"
vtctldclient vdiff create --target-keyspace customer --workflow commerce2customer
mysql -e "show vitess_tablets"
You will see that the target/customer primary tablet is not healthy (as the vttablet process is gone):
❯ mysql -e "show vitess_tablets"
+-------+----------+-------+------------+-------------+------------------+-----------+----------------------+
| Cell  | Keyspace | Shard | TabletType | State       | Alias            | Hostname  | PrimaryTermStartTime |
+-------+----------+-------+------------+-------------+------------------+-----------+----------------------+
| zone1 | commerce | 0     | PRIMARY    | SERVING     | zone1-0000000100 | localhost | 2025-12-03T17:08:36Z |
| zone1 | commerce | 0     | REPLICA    | SERVING     | zone1-0000000101 | localhost |                      |
| zone1 | commerce | 0     | RDONLY     | SERVING     | zone1-0000000102 | localhost |                      |
| zone1 | customer | 0     | PRIMARY    | NOT_SERVING | zone1-0000000200 | localhost | 2025-12-03T17:08:51Z |
| zone1 | customer | 0     | REPLICA    | SERVING     | zone1-0000000201 | localhost |                      |
| zone1 | customer | 0     | RDONLY     | SERVING     | zone1-0000000202 | localhost |                      |
+-------+----------+-------+------------+-------------+------------------+-----------+----------------------+
Binary Version
vtgate version Version: 24.0.0-SNAPSHOT (Git revision a49fba2643ade8d86982403d7b665555c279ca77 branch 'main') built on Tue Dec 2 22:32:28 UTC 2025 by matt@pslord.local using go1.25.3 darwin/arm64
Operating System and Environment details
Log Fragments