rpc: refuse incoming connections unless dialback succeeds #84289

@knz

Description

Is your feature request related to a problem? Please describe.

Followup to #49220 / cockroachlabs/support#1690

We know of multiple situations where a cluster can find itself in an asymmetric partition, which causes all kinds of symptoms (#49220). These situations include at least:

  • low-level network partitions, when e.g. IP packets get dropped in one direction but not the other
  • firewall partitions, when e.g. existing TCP sessions stay alive but new TCP sessions cannot be established
  • TLS misconfigurations, when two nodes use different client/server certs, so that the TLS handshake succeeds in one direction but fails in the other
  • DNS misconfigurations, where the advertised hostname of a node is not resolvable from (some) other nodes

It would be good if we had some automation to exclude nodes which appear to be partially partitioned away and require operator attention.

Describe the solution you'd like

We could have a couple of relatively simple mechanisms to protect a cluster:

Two point-to-point mechanisms, to protect against pairwise partitions:

  1. When receiving an incoming connection from another node, refuse to respond to heartbeats until we get a successful outgoing dial to the same node. In other words: when n2 connects to n1, have n1 refuse the conn until it can dial back to n2 successfully.

  2. When an outgoing dial fails, or a heartbeat fails, keep a timestamp of the failure for the remote node ID, and when receiving a heartbeat from that ID, refuse to respond to the heartbeat if there's a recent failure (and possibly actively close the connection). In other words, start refusing a heartbeat from n2 to n1, if n1 has failed to connect/heartbeat to n2 recently.

Then a cluster-wide mechanism, to protect against global partitions (e.g. n1-n2 and n2-n3 can connect, but not n1-n3):

  1. If one of the two mechanisms above decides to exclude an ID, it would also register a gossip entry (with a suitable TTL) that says the node is potentially blocked, containing the IDs of both the blocked node and the decider node. Then, on every node, use these gossip entries as follows: if a node X connects to node Y, and node Y is in a "blocked node" entry created by node Z, and node X has a valid connection to node Z (i.e. X and Z are on the same side of the partition), then have X proactively start blocking Y too.

Jira issue: CRDB-17572

gz#13169

Epic CRDB-2488

Metadata

Labels

  • A-kv-server: Relating to the KV-level RPC server
  • C-enhancement: Solution expected to add code/behavior + preserve backward-compat (pg compat issues are exception)
  • GA-blocker
  • T-kv: KV Team
