server: back-connect after node join fails with mismatched node ID #54762
Description
Hey folks,
We have an existing cluster and would like to add nodes to it.
The new node was able to establish a connection with node n0 and joined the gossip network:
node connected via gossip and verified as part of cluster "xxxxxx-xxxxxxx-xxxxxx".
However, when the new node tries to connect to the next node, we see the following error:
initial connection heartbeat failed: rpc error: code = Unknown desc = client requested node ID 2 doesn't match server node ID 1.
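Going only by the wording of the message, the check behind it presumably has roughly the following shape. This is a minimal sketch in Go, not CockroachDB's actual code: the helper name `checkNodeID` and its parameters are hypothetical, and the scenario in `main` (dialing an address expected to be node 2 and reaching node 1) is our assumption about what the error describes.

```go
package main

import "fmt"

// checkNodeID is a hypothetical helper, not CockroachDB's real function.
// requested is the node ID the dialing node expects to find at the address;
// actual is the node ID of the server that answered the heartbeat.
func checkNodeID(requested, actual int) error {
	if requested != actual {
		return fmt.Errorf("client requested node ID %d doesn't match server node ID %d",
			requested, actual)
	}
	return nil
}

func main() {
	// Assumed scenario: the new node dials the address it believes
	// belongs to node 2, but the server at that address is node 1.
	fmt.Println(checkNodeID(2, 1))
	// Expected and actual IDs agree, so the check passes.
	fmt.Println(checkNodeID(1, 1))
}
```

If this reading is right, the mismatch would be between the address the new node has on record for a peer and the node that actually answers at that address.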
Could you please explain what the error means? What didn't match, and how can we fix this issue?
Thanks in advance.
To Reproduce
We created a cluster running in secure mode. We made sure that all nodes had the same ca.crt and that any previous data directories the new node could read from were deleted, so that it didn't have metadata from any other cluster. The node boots up and joins the gossip network, but fails to connect to the next node and then logs the same error repeatedly.
Expected behavior
We have created other clusters with no issues, and we expected this to just work.
Environment:
- CockroachDB version 20.1.1
- Server OS: GKE's Container Optimized OS
Additional context
https://cockroachdb.slack.com/archives/CP4D9LD5F/p1600975349012100
Epic: CRDB-8500
Jira issue: CRDB-3720