sql: distsql plans against unavailable node #23601

@solongordon

After stopping one node in a four-node cluster, I noticed that distsql intermittently continues to include that node in physical plans, which causes query execution errors. This is reproducible on a local cluster.

# Start a local four node cluster.
roachprod create local --nodes 4
roachprod start local

# Create a table with leaseholders on all four nodes.
roachprod sql local:1 <<EOF
CREATE DATABASE IF NOT EXISTS d;
DROP TABLE IF EXISTS d.t;
CREATE TABLE d.t (x INT PRIMARY KEY);
INSERT INTO d.t VALUES (1), (2), (3), (4);
ALTER TABLE d.t SPLIT AT VALUES (2), (3), (4);
EOF

sleep 1

roachprod sql local:1 <<EOF
ALTER TABLE d.t TESTING_RELOCATE VALUES
  (ARRAY[1,2,3], 1),
  (ARRAY[2,3,4], 2),
  (ARRAY[3,4,1], 3),
  (ARRAY[4,1,2], 4);
EOF

# Stop one of the nodes.
roachprod stop local:4

Now if you repeatedly run SELECT * FROM d.t on node 1, about half the time it returns the correct result, and the other half it fails with an rpc error:

roachprod sql local:1 <<EOF
SELECT * FROM d.t;
EOF

pq: initial connection heartbeat failed: rpc error: code = Unavailable desc = all SubConns are in TransientFailure
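To put a number on "about half the time", a small loop can rerun the query and count the failing runs. This is only a sketch: `count_failures` is a hypothetical helper, not part of roachprod, and it assumes that `roachprod sql` exits non-zero when the query returns an error, which I have not verified.

```shell
# Hypothetical helper: run a command N times and print how many runs failed.
# Usage: count_failures N command [args...]
# The body is a subshell, so the loop variables do not leak into the caller.
count_failures() (
  runs=$1; shift
  failures=0
  i=0
  while [ "$i" -lt "$runs" ]; do
    # Discard the output; only the exit status matters here.
    if ! "$@" >/dev/null 2>&1; then
      failures=$((failures + 1))
    fi
    i=$((i + 1))
  done
  echo "$failures"
)

# Example, assuming the query is wrapped in a function as in the repro above:
#   run_query() {
#     roachprod sql local:1 <<EOF
#   SELECT * FROM d.t;
#   EOF
#   }
#   count_failures 20 run_query
```

With the behavior described here, the printed count should land somewhere near half of the number of runs.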

And if you repeatedly request the distsql explain plan, about half the time it will incorrectly include the stopped node:

roachprod sql local:1 <<EOF
SELECT "URL" FROM [EXPLAIN (DISTSQL) SELECT * FROM d.t];
EOF

https://cockroachdb.github.io/distsqlplan/decode.html?eJzEkT1rwzAQhvf-jHdWwR_poqlrlrSEbkWDah3B4OjESYaW4P9eLA2pg1taAumou3vuOfGe4NnRzh4pQr-ihkIDhRYKGxiFINxRjCxzuwxv3Tt0pdD7MKa5bBQ6FoI-IfVpIGi82LeB9mQdCRQcJdsPWRCkP1r5eEwwkwKP6bwjJnsg6HpS33jO60fP4kjILZabaeWSHd9zuBhbFzcLcX2jDzY38rT_ENiKZ08xsI_0q0SqOVByByrpRx6lo2fhLmvK8ylzueAoptKty2Prcysf-BWuf4QfFnB1CTfXmNtr4M2fYDPdfQYAAP__FSJJrw==
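One way to see how often the plan differs is to collect the output of repeated EXPLAIN runs and tally the distinct lines. The sketch below assumes the plan URL is printed on a single line; `tally_outputs` is a hypothetical helper name.

```shell
# Hypothetical helper: run a command N times and tally its distinct output lines.
# Usage: tally_outputs N command [args...]
# Prints "count line" pairs, most frequent first.
tally_outputs() (
  runs=$1; shift
  i=0
  while [ "$i" -lt "$runs" ]; do
    # Errors go to /dev/null so only regular output is tallied.
    "$@" 2>/dev/null
    i=$((i + 1))
  done | sort | uniq -c | sort -rn
)

# Example, assuming the EXPLAIN query is wrapped in a function:
#   explain_plan() {
#     roachprod sql local:1 <<EOF
#   SELECT "URL" FROM [EXPLAIN (DISTSQL) SELECT * FROM d.t];
#   EOF
#   }
#   tally_outputs 20 explain_plan
```

Any header lines printed by the SQL shell are tallied too. If the plan alternates as described, two distinct URLs should show up with roughly equal counts.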

The behavior persists after decommissioning the stopped node.

From the session traces, it looks like distsql always tries to plan against the stopped node. There appears to be a race condition: sometimes the planner discovers during planning that the node is unhealthy and adjusts, and sometimes it doesn't, in which case the query fails at execution.

Labels

C-performance: Perf of queries or internals. Solution not expected to change functional behavior.