When performing SELECT FOR UPDATE on a multi-column-family table under read committed isolation, we are only locking the first column family, not the column family selected by the statement. This is incorrect.
CREATE TABLE abcde (a INT PRIMARY KEY, b INT, c INT, d INT, e INT, FAMILY (a, b), FAMILY (c), FAMILY (d), FAMILY (e));
INSERT INTO abcde VALUES (5, 6, 7, 8, 9);
SET CLUSTER SETTING sql.txn.read_committed_isolation.enabled = true;
-- first connection
BEGIN TRANSACTION ISOLATION LEVEL READ COMMITTED;
SELECT e FROM abcde WHERE a = 5 FOR UPDATE;
-- second connection
SET enable_implicit_select_for_update = false;
-- this succeeds immediately, but should block
UPDATE abcde SET e = 19 WHERE a = 5;
-- this blocks, but should succeed immediately
UPDATE abcde SET b = 16 WHERE a = 5;
-- back to first connection
-- this returns 19 but should return 9
SELECT e FROM abcde WHERE a = 5 FOR UPDATE;
COMMIT;
demo@127.0.0.1:26257/demoapp/defaultdb OPEN> SELECT e FROM abcde WHERE a = 5 FOR UPDATE;
e
-----
9
(1 row)
Time: 3ms total (execution 3ms / network 0ms)
timestamp | age | message | tag | location | operation | span
--------------------------------+-----------------+----------------------------------------------------------------------------------------------+------------------------------------------------------------+-----------------------------------------+------------------+-------
2023-11-22 22:45:24.178507+00 | 00:00:00.001131 | Scan /Tenant/2/Table/104/1/5/0, /Tenant/2/Table/104/1/5/3/1 | [n1,client=127.0.0.1:65289,hostssl,user=demo] | sql/row/kv_batch_fetcher.go:535 | colbatchscan | 10
2023-11-22 22:45:24.178546+00 | 00:00:00.00117 | querying next range at /Tenant/2/Table/104/1/5/0 | [n1,client=127.0.0.1:65289,hostssl,user=demo,txn=ca9a5198] | kv/kvclient/kvcoord/range_iter.go:188 | dist sender send | 12
2023-11-22 22:45:24.178588+00 | 00:00:00.001212 | key: /Tenant/2/Table/104/1/5/0, desc: r70:/Tenant/{2/Table/104-3} [(n1,s1):1, next=2, gen=3] | [n1,client=127.0.0.1:65289,hostssl,user=demo,txn=ca9a5198] | kv/kvclient/kvcoord/range_iter.go:225 | dist sender send | 12
2023-11-22 22:45:24.178621+00 | 00:00:00.001245 | r70: sending batch 2 Get to (n1,s1):1 | [n1,client=127.0.0.1:65289,hostssl,user=demo,txn=ca9a5198] | kv/kvclient/kvcoord/dist_sender.go:2364 | dist sender send | 12
2023-11-22 22:45:24.179456+00 | 00:00:00.00208 | fetched: /abcde/abcde_pkey/5 -> <undecoded> | [n1,client=127.0.0.1:65289,hostssl,user=demo] | sql/colfetcher/cfetcher.go:1021 | colbatchscan | 10
2023-11-22 22:45:24.179469+00 | 00:00:00.002093 | fetched: /abcde/abcde_pkey/5/e -> 9 | [n1,client=127.0.0.1:65289,hostssl,user=demo] | sql/colfetcher/cfetcher.go:1021 | colbatchscan | 10
2023-11-22 22:45:24.179583+00 | 00:00:00.002207 | Scan /Tenant/2/Table/104/1/5/0 | [n1,client=127.0.0.1:65289,hostssl,user=demo] | sql/row/kv_batch_fetcher.go:535 | join reader | 8
2023-11-22 22:45:24.179623+00 | 00:00:00.002247 | querying next range at /Tenant/2/Table/104/1/5/0 | [n1,client=127.0.0.1:65289,hostssl,user=demo,txn=ca9a5198] | kv/kvclient/kvcoord/range_iter.go:188 | dist sender send | 15
2023-11-22 22:45:24.179652+00 | 00:00:00.002276 | key: /Tenant/2/Table/104/1/5/0, desc: r70:/Tenant/{2/Table/104-3} [(n1,s1):1, next=2, gen=3] | [n1,client=127.0.0.1:65289,hostssl,user=demo,txn=ca9a5198] | kv/kvclient/kvcoord/range_iter.go:225 | dist sender send | 15
2023-11-22 22:45:24.179671+00 | 00:00:00.002295 | r70: sending batch 1 Get to (n1,s1):1 | [n1,client=127.0.0.1:65289,hostssl,user=demo,txn=ca9a5198] | kv/kvclient/kvcoord/dist_sender.go:2364 | dist sender send | 15
2023-11-22 22:45:24.180601+00 | 00:00:00.003225 | fetched: /abcde/abcde_pkey/? -> <undecoded> | [n1,client=127.0.0.1:65289,hostssl,user=demo] | sql/row/fetcher.go:1145 | join reader | 8
2023-11-22 22:45:24.180776+00 | 00:00:00.0034 | rows affected: 1 | [n1,client=127.0.0.1:65289,hostssl,user=demo] | sql/conn_executor_exec.go:1577 | sql query | 2
(12 rows)
Time: 2ms total (execution 2ms / network 0ms)
When performing SELECT FOR UPDATE on a multi-column-family table under read committed isolation, we are only locking the first column family, not the column family selected by the statement. This is incorrect.
Here's how to demonstrate the problem:
If we look with
\set auto_trace=kvwe can see the locking semi lookup join is reading/Tenant/2/Table/104/1/5/0instead of/Tenant/2/Table/104/1/5/3:Jira issue: CRDB-33842