Summary
PeerAddressAllowlistFilter can build an incomplete allowlist when cluster peer hostnames do not resolve at the moment resolveNow() runs, and it then actively rejects legitimate peers (throwing SecurityException) instead of failing open during the DNS-resolution window. In Kubernetes StatefulSet deployments this happens routinely on startup and on pod restarts, where headless-service A records are only published once a pod is Ready and pod IPs change across restarts.
Observed in a customer 3-node k8s cluster (26.6.1-SNAPSHOT). It both destabilizes initial cluster bootstrap and can block a restarted/wiped follower from rejoining and receiving its snapshot.
Evidence (from customer logs)
At startup, peer hostnames fail to resolve, so they are silently dropped from the allowlist:
WARNI [PeerAddressAllowlistFilter] Cannot resolve cluster peer host 'splitter-2.splitter.sbu.svc.cluster.local' for Raft gRPC allowlist: ... Name does not resolve
WARNI [PeerAddressAllowlistFilter] Cannot resolve cluster peer host 'splitter-1.splitter.sbu.svc.cluster.local' for Raft gRPC allowlist: ... Name does not resolve
Then valid peers are rejected because their IP is not in the (incomplete) allowlist:
Root cause
In ha-raft/.../PeerAddressAllowlistFilter.java:
resolveNow() (called from the constructor and on a miss) iterates peerHosts and, on UnknownHostException, only logs a WARNING and drops the host from the resolved set. There is no record that the host failed, no retry state, and no distinction between "resolved to no addresses" and "DNS not ready yet".
transportReady() re-resolves on a miss, but the re-resolve is rate-limited by refreshIntervalMs (arcadedb.ha.grpcAllowlistRefreshMs, default 30s). Because the constructor has just resolved, the first wave of inbound connections after startup falls inside that window, so the re-resolve is skipped and the peer is rejected immediately.
If a peer's DNS is still unpublished on the next allowed re-resolve (pod not yet Ready), it stays rejected until both the refresh interval elapses and DNS resolves, with gRPC DNS caching adding further lag.
The net effect is a self-inflicted partition during the exact window (bootstrap / rolling restart) when the cluster most needs peers to connect.
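The sequence described above can be reproduced with a small model. This is a simplified sketch, not the actual ArcadeDB source: the class AllowlistRaceModel, the Resolver interface, the injected clock, and the checkPeer() method (standing in for transportReady()) are all hypothetical names introduced for illustration. It only mirrors the behavior stated in this issue: failed hosts are dropped silently, and a miss re-resolves only once refreshIntervalMs has elapsed since the last resolution.

```java
import java.net.UnknownHostException;
import java.util.HashSet;
import java.util.List;
import java.util.Set;
import java.util.function.LongSupplier;

/** Simplified model of the behavior described above; NOT the actual ArcadeDB source. */
final class AllowlistRaceModel {

  /** Hypothetical seam so that DNS can be faked in a test. */
  interface Resolver {
    Set<String> resolve(String host) throws UnknownHostException;
  }

  private final List<String> peerHosts;
  private final Resolver resolver;
  private final LongSupplier clockMs;
  private final long refreshIntervalMs;
  private Set<String> allowedIps = new HashSet<>();
  private long lastResolveMs;

  AllowlistRaceModel(List<String> peerHosts, Resolver resolver, LongSupplier clockMs, long refreshIntervalMs) {
    this.peerHosts = peerHosts;
    this.resolver = resolver;
    this.clockMs = clockMs;
    this.refreshIntervalMs = refreshIntervalMs;
    resolveNow(); // resolving here also starts the rate-limit window
  }

  void resolveNow() {
    Set<String> resolved = new HashSet<>();
    for (String host : peerHosts) {
      try {
        resolved.addAll(resolver.resolve(host));
      } catch (UnknownHostException e) {
        // The host is dropped: no failure record and no retry state are kept.
      }
    }
    allowedIps = resolved;
    lastResolveMs = clockMs.getAsLong();
  }

  /** Stands in for transportReady(): throws if the remote IP is not allowlisted. */
  void checkPeer(String remoteIp) {
    if (allowedIps.contains(remoteIp))
      return;
    // A miss triggers a re-resolve only when the rate-limit interval has elapsed.
    if (clockMs.getAsLong() - lastResolveMs >= refreshIntervalMs) {
      resolveNow();
      if (allowedIps.contains(remoteIp))
        return;
    }
    throw new SecurityException("Peer " + remoteIp + " is not in the allowlist");
  }
}
```

In this model, a peer whose DNS record appears a few seconds after construction is still rejected until the full interval has passed, even though a fresh lookup would succeed.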
Proposed hardening
Some combination of:
Force a re-resolve on a miss whenever the allowlist is known-incomplete (i.e. fewer hosts resolved than configured), bypassing the refreshIntervalMs rate limit. Keep the rate limit only for the steady state where all peers already resolved at least once.
Track per-host last-known-good IPs and keep them in the allowlist (with a TTL) so a transient DNS blip does not evict a peer that resolved fine moments ago (sticky entries). This directly addresses pod-IP churn + caching.
Startup grace window: until the first complete resolution of all peer hosts succeeds, prefer logging + allowing (fail-open) over rejecting, or at minimum retry resolution aggressively. The filter is explicitly documented as "NOT a substitute for mTLS" (Raft gRPC: add support for mTLS for encryption and peer identity #3890), so a short fail-open bootstrap window is an acceptable trade-off and far safer operationally than locking the cluster out of itself.
Optionally, retry unresolved hosts proactively on a background tick rather than only on inbound misses, so the allowlist converges even while the cluster is quiet.
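A minimal sketch of how the first three options might fit together is shown below. It is an illustration under stated assumptions, not a patch against the real filter: HardenedAllowlistSketch, Resolver, Entry, isAllowed(), systemDns(), and the stickyTtlMs parameter are hypothetical names, and the choice to fail open until the first complete resolution is just one of the policies listed above. The background retry tick is left out.

```java
import java.net.InetAddress;
import java.net.UnknownHostException;
import java.util.HashMap;
import java.util.HashSet;
import java.util.List;
import java.util.Map;
import java.util.Set;
import java.util.function.LongSupplier;

/** Hypothetical sketch of the proposed hardening; not the actual PeerAddressAllowlistFilter. */
final class HardenedAllowlistSketch {

  /** Hypothetical seam so that DNS can be faked in a test. */
  interface Resolver {
    Set<String> resolve(String host) throws UnknownHostException;
  }

  /** Last-known-good addresses for one host and when they were last confirmed. */
  private static final class Entry {
    final Set<String> ips;
    final long confirmedAtMs;

    Entry(Set<String> ips, long confirmedAtMs) {
      this.ips = ips;
      this.confirmedAtMs = confirmedAtMs;
    }
  }

  private final List<String> peerHosts;
  private final Resolver resolver;
  private final LongSupplier clockMs;
  private final long refreshIntervalMs;
  private final long stickyTtlMs; // hypothetical TTL for sticky entries
  private final Map<String, Entry> known = new HashMap<>();
  private long lastResolveMs;
  private boolean everComplete; // true once all hosts resolved in a single pass

  HardenedAllowlistSketch(List<String> peerHosts, Resolver resolver, LongSupplier clockMs,
      long refreshIntervalMs, long stickyTtlMs) {
    this.peerHosts = peerHosts;
    this.resolver = resolver;
    this.clockMs = clockMs;
    this.refreshIntervalMs = refreshIntervalMs;
    this.stickyTtlMs = stickyTtlMs;
    resolveNow();
  }

  /** Resolver backed by the JVM's own DNS lookup. */
  static Resolver systemDns() {
    return host -> {
      Set<String> ips = new HashSet<>();
      for (InetAddress address : InetAddress.getAllByName(host))
        ips.add(address.getHostAddress());
      return ips;
    };
  }

  synchronized void resolveNow() {
    final long now = clockMs.getAsLong();
    lastResolveMs = now;
    int resolvedHosts = 0;
    for (String host : peerHosts) {
      try {
        Set<String> ips = resolver.resolve(host);
        if (!ips.isEmpty()) {
          known.put(host, new Entry(new HashSet<>(ips), now));
          resolvedHosts++;
        }
      } catch (UnknownHostException e) {
        // Sticky: keep any last-known-good entry; it expires through the TTL instead.
      }
    }
    if (resolvedHosts == peerHosts.size())
      everComplete = true;
  }

  synchronized boolean isAllowed(String remoteIp) {
    final long now = clockMs.getAsLong();
    known.values().removeIf(e -> now - e.confirmedAtMs > stickyTtlMs);
    if (contains(remoteIp))
      return true;
    // Bypass the rate limit while fewer hosts are known than configured.
    final boolean incomplete = known.size() < peerHosts.size();
    if (incomplete || now - lastResolveMs >= refreshIntervalMs) {
      resolveNow();
      if (contains(remoteIp))
        return true;
    }
    // Startup grace: fail open until the first complete resolution has succeeded.
    return !everComplete;
  }

  private boolean contains(String ip) {
    for (Entry e : known.values())
      if (e.ips.contains(ip))
        return true;
    return false;
  }
}
```

Outside of tests, the sketch could be constructed with HardenedAllowlistSketch.systemDns() as the resolver and System::currentTimeMillis as the clock. Note that bypassing the rate limit entirely while the allowlist is incomplete means one DNS pass per miss; a real implementation would probably keep a short minimum interval to avoid hammering the resolver.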
Impact / workaround
Contributes to leader-election churn and follower divergence during bulk load on k8s.
Can block a restarted or wiped follower from rejoining and re-acquiring a snapshot from the leader.
Current workaround: temporarily set arcadedb.ha.peerAllowlist.enabled=false during bootstrap/recovery, then re-enable.
Files
ha-raft/src/main/java/com/arcadedb/server/ha/raft/PeerAddressAllowlistFilter.java
Configuration keys: arcadedb.ha.grpcAllowlistRefreshMs, arcadedb.ha.peerAllowlist.enabled