When the target keyspace has pre-populated tablet controls, MoveTables SwitchTraffic fails with an error that requires manual cleanup before reads and writes can resume. This occurs when the TabletControls contain a list of denied tables that doesn't match the tables in the currently running workflow. If the workflow's tables don't match the TabletControls one for one, SwitchTraffic returns an error.
Any traffic sent after this point keeps failing with errors in the application until the TabletControls are removed and the shard state is refreshed.
$ vtctlclient --server :15999 GetShard fane_import_sharded/-80
{
...
"tablet_controls": [
{
"tablet_type": 1,
"cells": [],
"denied_tables": [
"sbtest1",
"sbtest2",
"sbtest3",
"sbtest4",
"sbtest5",
"sbtest6",
"testing"
],
...
}
$ vtctlclient --server :15999 Workflow fane_import_sharded.import-shard-80 show
{
"Workflow": "import-shard-80",
"SourceLocation": {
"Keyspace": "fane_import_sharded_source",
"Shards": [
"-80"
]
},
"TargetLocation": {
"Keyspace": "fane_import_sharded",
"Shards": [
"-80"
]
},
"MaxVReplicationLag": 1,
"MaxVReplicationTransactionLag": 1,
"Frozen": false,
"ShardStatuses": {
"-80/aws_useast1a_6-3337899395": {
"PrimaryReplicationStatuses": [
{
"Shard": "-80",
"Tablet": "aws_useast1a_6-3337899395",
"ID": 6,
"Bls": {
"keyspace": "fane_import_sharded_source",
"shard": "-80",
"filter": {
"rules": [
{
"match": "sbtest1",
"filter": "select * from sbtest1 where in_keyrange(id, 'fane_import_sharded.hash', '-80')"
},
{
"match": "sbtest2",
"filter": "select * from sbtest2 where in_keyrange(id, 'fane_import_sharded.hash', '-80')"
},
{
"match": "sbtest3",
"filter": "select * from sbtest3 where in_keyrange(id, 'fane_import_sharded.hash', '-80')"
},
{
"match": "sbtest4",
"filter": "select * from sbtest4 where in_keyrange(id, 'fane_import_sharded.hash', '-80')"
},
{
"match": "sbtest5",
"filter": "select * from sbtest5 where in_keyrange(id, 'fane_import_sharded.hash', '-80')"
},
{
"match": "sbtest6",
"filter": "select * from sbtest6 where in_keyrange(id, 'fane_import_sharded.hash', '-80')"
},
{
"match": "sbtest7",
"filter": "select * from sbtest7 where in_keyrange(id, 'fane_import_sharded.hash', '-80')"
},
{
"match": "sbtest8",
"filter": "select * from sbtest8 where in_keyrange(id, 'fane_import_sharded.hash', '-80')"
},
{
"match": "testing",
"filter": "select * from testing"
}
]
}
},
"Pos": "7c3368f8-5412-11ee-8179-0a26551b1c25:1-1584,7c434390-5412-11ee-8c60-0a26551b1c25:1",
"StopPos": "",
"State": "Running",
"DBName": "fane_import_sharded",
"TransactionTimestamp": 0,
"TimeUpdated": 1694815788,
"TimeHeartbeat": 1694815788,
"TimeThrottled": 0,
"ComponentThrottled": "",
"Message": "",
"Tags": "",
"WorkflowType": "MoveTables",
"WorkflowSubType": "Partial",
"CopyState": null,
"RowsCopied": 0
}
],
"TabletControls": [
{
"tablet_type": 1,
"denied_tables": [
"sbtest1",
"sbtest2",
"sbtest3",
"sbtest4",
"sbtest5",
"testing"
]
}
],
"PrimaryIsServing": true
}
},
"SourceTimeZone": "",
"TargetTimeZone": ""
}
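The two outputs above already show the mismatch that SwitchTraffic trips over. As a rough pre-flight check, the following Python sketch computes which workflow tables are absent from the shard's denylist. It assumes that `tablet_controls` sits at the top level of the GetShard JSON and that the workflow's tables are the `match` values of the stream filter rules. The function names are hypothetical helpers, not part of Vitess, and the sample data is trimmed to the fields the check reads.

```python
import json


def denied_tables(shard_record, tablet_type=1):
    """Collect denied tables for one tablet type from a GetShard-style record.

    tablet_type=1 is the value shown in the outputs above.
    """
    tables = set()
    for control in shard_record.get("tablet_controls", []):
        if control.get("tablet_type") == tablet_type:
            tables.update(control.get("denied_tables", []))
    return tables


def workflow_tables(workflow):
    """Collect the table names matched by every stream's filter rules."""
    tables = set()
    for status in workflow.get("ShardStatuses", {}).values():
        for stream in status.get("PrimaryReplicationStatuses", []):
            for rule in stream["Bls"]["filter"]["rules"]:
                tables.add(rule["match"])
    return tables


def missing_from_denylist(shard_record, workflow):
    """Workflow tables that the denylist does not contain."""
    return sorted(workflow_tables(workflow) - denied_tables(shard_record))


# Trimmed copy of the GetShard output (assumed top-level layout).
shard_record = json.loads("""
{"tablet_controls": [{"tablet_type": 1, "cells": [],
  "denied_tables": ["sbtest1", "sbtest2", "sbtest3", "sbtest4",
                    "sbtest5", "sbtest6", "testing"]}]}
""")

# Trimmed copy of the Workflow show output: only the rule names are kept.
rule_names = [f"sbtest{i}" for i in range(1, 9)] + ["testing"]
workflow = {
    "ShardStatuses": {
        "-80/aws_useast1a_6-3337899395": {
            "PrimaryReplicationStatuses": [
                {"Bls": {"filter": {"rules": [{"match": t} for t in rule_names]}}}
            ]
        }
    }
}

print(missing_from_denylist(shard_record, workflow))
```

With the values from this report, the check prints `['sbtest7', 'sbtest8']`: the workflow covers two tables that the denylist does not, which is consistent with the "one or more do not exist in the denylist" error.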
$ vtctlclient --server :15999 MoveTables SwitchTraffic fane_import_sharded.import-shard-80
E0915 22:10:10.097662 696 main.go:96] E0915 22:10:10.097104 traffic_switcher.go:625] allowTargetWrites failed: Code: INVALID_ARGUMENT
cannot remove tables since one or more do not exist in the denylist
E0915 22:10:10.114269 696 main.go:96] E0915 22:10:10.113676 vtctl.go:2215]
cannot remove tables since one or more do not exist in the denylist
The following vreplication streams exist for workflow fane_import_sharded.import-shard-80:
id=6 on -80/aws_useast1a_6-3337899395: Status: Stopped. VStream Lag: 0s.
MoveTables Error: rpc error: code = Unknown desc = cannot remove tables since one or more do not exist in the denylist
E0915 22:10:10.216399 696 main.go:105] remote error: rpc error: code = Unknown desc = cannot remove tables since one or more do not exist in the denylist
$ sysbench --db-driver=mysql --threads=1 --events=0 --time=0 --mysql-host=127.0.0.1 --mysql-port=3306 --mysql-db=fane_import_sharded /usr/share/sysbench/oltp_insert.lua --tables=5 run
WARNING: Both event and time limits are disabled, running an endless test
sysbench 1.0.20 (using system LuaJIT 2.1.0-beta3)
Running the test with following options:
Number of threads: 1
Initializing random number generator from current time
Initializing worker threads...
Threads started!
FATAL: mysql_drv_query() returned error 1105 (target: fane_import_sharded_source.-80.primary: vttablet: rpc error: code = FailedPrecondition desc = disallowed due to rule: enforce denied tables (CallerID: admin)) for query 'INSERT INTO sbtest4 (id, k, c, pad) VALUES (0, 4098, '09169823527-14773847787-63328771402-43563606289-98835554319-17838113855-09276254645-46412092895-40264640011-92712584350', '67793249909-86081288100-12979568721-26815841297-77951231372')'
FATAL: `thread_run' function failed: /usr/share/sysbench/oltp_insert.lua:61: SQL error, errno = 1105, state = 'HY000': target: fane_import_sharded_source.-80.primary: vttablet: rpc error: code = FailedPrecondition desc = disallowed due to rule: enforce denied tables (CallerID: admin)
$ vtctldclient --server localhost:15999 SetShardTabletControl --remove fane_import_sharded_source/-80 primary
$ vtctldclient --server localhost:15999 RefreshStateByShard fane_import_sharded_source/-80
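The shard that needs cleanup can be read from the application error itself. The Python sketch below pulls the keyspace, shard, and tablet type out of a "denied tables" error and builds the same two recovery commands as above. The error format is assumed from the single sysbench log line in this report, and `parse_target` and `recovery_commands` are hypothetical helper names. The sketch only prints the commands; it does not run them.

```python
import re

# Matches "target: <keyspace>.<shard>.<tablet_type>:" as seen in the log line.
TARGET_RE = re.compile(
    r"target: (?P<keyspace>[^.\s]+)\.(?P<shard>[^.\s]+)\.(?P<tablet_type>\w+):"
)


def parse_target(error_message):
    """Return (keyspace, shard, tablet_type) for a denied-tables error, else None."""
    if "enforce denied tables" not in error_message:
        return None
    match = TARGET_RE.search(error_message)
    if match is None:
        return None
    return match.group("keyspace"), match.group("shard"), match.group("tablet_type")


def recovery_commands(keyspace, shard, tablet_type, server="localhost:15999"):
    """Build the two cleanup commands from this report as argument lists."""
    keyspace_shard = f"{keyspace}/{shard}"
    base = ["vtctldclient", "--server", server]
    return [
        base + ["SetShardTabletControl", "--remove", keyspace_shard, tablet_type],
        base + ["RefreshStateByShard", keyspace_shard],
    ]


message = (
    "target: fane_import_sharded_source.-80.primary: vttablet: rpc error: "
    "code = FailedPrecondition desc = disallowed due to rule: "
    "enforce denied tables (CallerID: admin)"
)

target = parse_target(message)
if target is not None:
    for command in recovery_commands(*target):
        print(" ".join(command))
```

Returning argument lists keeps the sketch side-effect free; they could be passed to `subprocess.run` once the target has been reviewed by hand.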
$ sysbench --db-driver=mysql --threads=1 --events=0 --time=0 --mysql-host=127.0.0.1 --mysql-port=3306 --mysql-db=fane_import_sharded /usr/share/sysbench/oltp_insert.lua --tables=5 run
WARNING: Both event and time limits are disabled, running an endless test
sysbench 1.0.20 (using system LuaJIT 2.1.0-beta3)
Running the test with following options:
Number of threads: 1
Initializing random number generator from current time
Initializing worker threads...
Threads started!
Overview of the Issue
Related Issue: #13998
Reproduction Steps
See Issue: #13998
Recovery Steps
Binary Version
Operating System and Environment details
Log Fragments