Overview of the Issue
This issue is based on an initial discussion in Slack (see here for context). I'm trying to set up a subsharding configuration and have run into an error when performing a JOIN operation. The query with the join statement and the error I'm getting is below:
mysql> select users.*, akey from users join customers on (customers.customer_id = users.customer_id);
ERROR 2013 (HY000): Lost connection to MySQL server during query
No connection. Trying to reconnect...
Reproduction Steps
To reproduce this error, I have created two tables a customers table and users table. Each customer can have many
users. I would like the primary shard key to be customer_id, with the subsharding key to be user_id. The tables definitions are below:
- Table configuration
mysql> desc customers;
+-------------+-----------------+------+-----+---------+----------------+
| Field | Type | Null | Key | Default | Extra |
+-------------+-----------------+------+-----+---------+----------------+
| customer_id | bigint unsigned | NO | PRI | NULL | auto_increment |
| akey | varchar(20) | NO | UNI | NULL | |
+-------------+-----------------+------+-----+---------+----------------+
mysql> desc users;
+----------------+-----------------+------+-----+----------+-------------------+
| Field | Type | Null | Key | Default | Extra |
+----------------+-----------------+------+-----+----------+-------------------+
| customer_id | bigint unsigned | NO | | NULL | |
| user_id | bigint unsigned | NO | PRI | NULL | auto_increment |
| ukey | varchar(20) | NO | UNI | NULL | |
+----------------+-----------------+------+-----+----------+-------------------+
- Vschema configuration for
users and customers tables
The vschema definition file I used with the users and customers tables is below:
{
"sharded": true,
"vindexes": {
"multicol_vdx": {
"type": "multicol",
"params": {
"column_count": "2",
"column_bytes": "3,5",
"column_vindex": "hash,hash"
}
},
"cust_vdx": {
"type": "hash"
}
},
"tables": {
"customers": {
"column_vindexes": [
{
"column": "customer_id",
"name": "cust_vdx"
}
],
"auto_increment": {
"column": "customer_id",
"sequence": "duo_unsharded.cust_seq"
}
},
"users": {
"column_vindexes": [
{
"columns": ["customer_id", "user_id"],
"name": "multicol_vdx"
}
],
"auto_increment": {
"column": "user_id",
"sequence": "duo_unsharded.user_seq"
}
},
}
}
- SQL files for the sequence / auto-increment columns for
users and customers tables. These are configured in a separate un-shareded cluster
-- Setup the sequence table for the user table auto_increment id.
CREATE TABLE IF NOT EXISTS user_seq (id bigint(20), next_id bigint(20), cache bigint(20), primary key(id)) COMMENT 'vitess_sequence';
INSERT INTO user_seq (id, next_id, cache) VALUES (0, 1, 3);
-- Setup the sequence table for the customer table auto_increment id.
CREATE TABLE IF NOT EXISTS cust_seq (id bigint(20), next_id bigint(20), cache bigint(20), primary key(id)) COMMENT 'vitess_sequence';
INSERT INTO cust_seq (id, next_id, cache) VALUES (0, 1, 3);
- Vschema definition for the sequence / auto-increment columns
{
"sharded": false,
"tables": {
"user_seq": {
"type": "sequence"
},
"cust_seq": {
"type": "sequence"
}
}
}
- I have a python script that inserted a couple of rows in the users table that reference a single row in the customers table. The query that causes the error is based on a JOIN statement:
mysql> select users.*, akey from users join customers on (users.customer_id = customers.customer_id);
ERROR 2013 (HY000): Lost connection to MySQL server during query
No connection. Trying to reconnect...
Connection id: 53
- Here are the log files from my local docker container running vtgate, vttablet. and the MySQL servers.
vtgate.out.txt
vtgate.ERROR.txt
vtgate.INFO.txt
vtgate.WARNING.txt
Binary Version
I'm working with the local Docker installation which uses the following version:
vitess@6a0a6ee5c240:/vt/local$ vtgate --version
ERROR: logging before flag.Parse: E0503 17:21:29.980551 8183 syslogger.go:149] can't connect to syslog
Version: 17.0.0-SNAPSHOT (Git revision 4ab581ceab76e869e62e3028dae06f53293d2eff branch 'duo-poc') built on Wed May 3 12:58:47 UTC 2023 by vitess@buildkitsandbox using go1.20.1 linux/amd64
### Operating System and Environment details
This is error was found using the local docker built image, which is running on a Mac OS Ventura Version 13.3.1 (22E261). The docker container info:
vitess@6a0a6ee5c240:/vt/local$ cat /etc/os-release
PRETTY_NAME="Debian GNU/Linux 10 (buster)"
NAME="Debian GNU/Linux"
VERSION_ID="10"
VERSION="10 (buster)"
VERSION_CODENAME=buster
ID=debian
HOME_URL="https://www.debian.org/"
SUPPORT_URL="https://www.debian.org/support"
BUG_REPORT_URL="https://bugs.debian.org/"
vitess@6a0a6ee5c240:/vt/local$ uname -sr
Linux 5.15.49-linuxkit
vitess@6a0a6ee5c240:/vt/local$ uname -m
x86_64
Log Fragments
From the vtgate.Error log file:
E0502 20:40:22.503332 6391 server.go:348] mysql_server caught panic:
this switch should be exhaustive
vitess.io/vitess/go/vt/vtgate/planbuilder/operators/sharded_routing.go:330 (0x1036325)
This leads to a function in the sharded_routing.go:330:
func (tr *ShardedRouting) Cost() int {
switch tr.RouteOpCode {
case engine.EqualUnique:
return 1
case engine.Equal:
return 5
case engine.IN:
return 10
case engine.MultiEqual:
return 10
case engine.Scatter:
return 20
default:
panic("this switch should be exhaustive")
}
}
I added the following case:
case engine.SubShard:
return 5
NOTE: This addition fixed the error message and I was able to successfully execute the JOIN query.
Overview of the Issue
This issue is based on an initial discussion in Slack (see here for context). I'm trying to set up a subsharding configuration and have run into an error when performing a JOIN operation. The query with the
joinstatement and the error I'm getting is below:Reproduction Steps
To reproduce this error, I have created two tables a customers table and users table. Each customer can have many
users. I would like the primary shard key to be customer_id, with the subsharding key to be user_id. The tables definitions are below:
usersandcustomerstablesThe
vschemadefinition file I used with theusersandcustomerstables is below:usersandcustomerstables. These are configured in a separate un-shareded clustervtgate.out.txt
vtgate.ERROR.txt
vtgate.INFO.txt
vtgate.WARNING.txt
Binary Version
This is error was found using the local docker built image, which is running on a Mac OS Ventura Version 13.3.1 (22E261). The docker container info:
vitess@6a0a6ee5c240:/vt/local$ cat /etc/os-release
PRETTY_NAME="Debian GNU/Linux 10 (buster)"
NAME="Debian GNU/Linux"
VERSION_ID="10"
VERSION="10 (buster)"
VERSION_CODENAME=buster
ID=debian
HOME_URL="https://www.debian.org/"
SUPPORT_URL="https://www.debian.org/support"
BUG_REPORT_URL="https://bugs.debian.org/"
vitess@6a0a6ee5c240:/vt/local$ uname -sr
Linux 5.15.49-linuxkit
vitess@6a0a6ee5c240:/vt/local$ uname -m
x86_64
Log Fragments
This leads to a function in the
sharded_routing.go:330:I added the following case:
NOTE: This addition fixed the error message and I was able to successfully execute the JOIN query.