Context
The Ratis gRPC transport used for Raft log replication, leader election, and snapshot chunk transfer currently has no peer authentication and no in-transit encryption. Today the only inter-node protections are:
- X-ArcadeDB-Cluster-Token header on the HTTP side channels (snapshot download, database verify)
- Network-level isolation (K8s NetworkPolicy, private subnet, VPN), called out as the recommended hardening step in the RaftHAServer.startService() warnings
The Raft gRPC port remains an unauthenticated back door: any host that can reach the port can open a gRPC stream to the Ratis server and inject log entries.
A follow-up change will add a ServerTransportFilter-based peer-address allowlist to close the "random host on the network knows the port" case out of the box. That mitigation is IP-based and is defeated by IP spoofing on a flat L2 network or by a compromised peer. Production deployments need cryptographic peer identity and in-transit encryption.
Goal
Add optional mTLS to the Raft gRPC transport using the GrpcTlsConfig mechanism already supported by Apache Ratis (GrpcConfigKeys.TLS.setConf, GrpcConfigKeys.Admin.setTlsConf, GrpcConfigKeys.Client.setTlsConf, GrpcConfigKeys.Server.setTlsConf).
When enabled:
- Every Raft gRPC connection between nodes negotiates TLS with mutual client-certificate authentication.
- Peer identity is bound to the certificate (CN/SAN), verified against a shared cluster CA.
- All AppendEntries / InstallSnapshot / RequestVote traffic is encrypted in transit.
- Unauthenticated or non-CA-signed peers are rejected at the TLS handshake.
Proposed configuration surface
New settings under arcadedb.ha.tls.*:
- arcadedb.ha.tls.enabled - boolean, default false (preserve zero-config dev/test)
- arcadedb.ha.tls.certChainFile - PEM file with this node's certificate (and intermediates if any)
- arcadedb.ha.tls.privateKeyFile - PEM file with this node's private key
- arcadedb.ha.tls.trustCertCollectionFile - PEM file with the cluster CA certificate(s)
- arcadedb.ha.tls.mutualAuth - boolean, default true
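A minimal sketch of how the settings above could be read with their proposed defaults. The class name HaTlsSettings and the use of java.util.Properties are assumptions for illustration only; the real implementation would go through whatever configuration mechanism the project already uses.

```java
import java.util.Properties;

/** Hypothetical holder for the proposed arcadedb.ha.tls.* settings. */
final class HaTlsSettings {
  static final String PREFIX = "arcadedb.ha.tls.";

  final boolean enabled;
  final String certChainFile;
  final String privateKeyFile;
  final String trustCertCollectionFile;
  final boolean mutualAuth;

  private HaTlsSettings(boolean enabled, String certChainFile, String privateKeyFile,
      String trustCertCollectionFile, boolean mutualAuth) {
    this.enabled = enabled;
    this.certChainFile = certChainFile;
    this.privateKeyFile = privateKeyFile;
    this.trustCertCollectionFile = trustCertCollectionFile;
    this.mutualAuth = mutualAuth;
  }

  /** Reads the five settings; absent booleans fall back to the proposed defaults. */
  static HaTlsSettings fromProperties(Properties props) {
    return new HaTlsSettings(
        // TLS stays off unless explicitly enabled (zero-config dev/test).
        Boolean.parseBoolean(props.getProperty(PREFIX + "enabled", "false")),
        props.getProperty(PREFIX + "certChainFile"),
        props.getProperty(PREFIX + "privateKeyFile"),
        props.getProperty(PREFIX + "trustCertCollectionFile"),
        // Mutual authentication is on by default once TLS is enabled.
        Boolean.parseBoolean(props.getProperty(PREFIX + "mutualAuth", "true")));
  }
}
```

With an empty Properties object this yields enabled=false and mutualAuth=true, and the three file paths are null until an operator sets them.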
Wiring in RaftPropertiesBuilder / RaftHAServer:
- When arcadedb.ha.tls.enabled=true, build a GrpcTlsConfig from the configured files and install it via GrpcConfigKeys.TLS.setConf(parameters, tlsConf) on the Parameters object passed to RaftServer.Builder.setParameters(...).
- Apply the same GrpcTlsConfig on the RaftClient.Builder.setParameters(...) used by RaftHAServer.buildRaftClient() so the leader's self-client also speaks TLS.
- Fail-fast on startup if TLS is enabled but any of the cert/key/trust paths are missing or unreadable, with a clear error message.
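The fail-fast check could look roughly like the sketch below. It uses only the JDK and is not tied to Ratis; the class name HaTlsFileCheck, the method names, and the message wording are hypothetical. The map is keyed by setting name so that the error can tell the operator exactly which setting to fix.

```java
import java.nio.file.Files;
import java.nio.file.Path;
import java.util.ArrayList;
import java.util.List;
import java.util.Map;

/** Hypothetical startup check for the cert/key/trust file settings. */
final class HaTlsFileCheck {
  private HaTlsFileCheck() {
  }

  /** Returns one message per setting whose file is unset, missing, or unreadable. */
  static List<String> findProblems(Map<String, String> pathsBySetting) {
    List<String> problems = new ArrayList<>();
    for (Map.Entry<String, String> entry : pathsBySetting.entrySet()) {
      String value = entry.getValue();
      if (value == null || value.isBlank()) {
        problems.add(entry.getKey() + " is not set");
        continue;
      }
      Path path = Path.of(value);
      if (!Files.isRegularFile(path))
        problems.add(entry.getKey() + " does not point to a file: " + path);
      else if (!Files.isReadable(path))
        problems.add(entry.getKey() + " points to an unreadable file: " + path);
    }
    return problems;
  }

  /** Throws if TLS is enabled and any file is unusable; does nothing when TLS is off. */
  static void requireUsable(boolean tlsEnabled, Map<String, String> pathsBySetting) {
    if (!tlsEnabled)
      return;
    List<String> problems = findProblems(pathsBySetting);
    if (!problems.isEmpty())
      throw new IllegalStateException(
          "Raft TLS is enabled but misconfigured: " + String.join("; ", problems));
  }
}
```

Collecting every problem before throwing means a single failed startup reports all misconfigured paths rather than one per restart. Calling requireUsable before the Raft server is built keeps a misconfigured node from ever opening the gRPC port.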
Documentation deliverables
docs/arcadedb-ha-*.md section on production hardening, including:
- openssl-based recipe for a self-signed cluster CA and per-node cert (already drafted in internal notes)
- Cert-manager recipe for Kubernetes StatefulSet deployments (SAN must match the pod's stable DNS name)
- Vault PKI / internal-PKI guidance for orgs that already run a CA
- Operational notes on rotation and expiry
- Release-notes callout: mTLS is the supported way to secure the Raft gRPC port; the peer-address allowlist is a best-effort default, not a substitute.
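For the cert-manager recipe, the SAN each pod certificate needs is the pod's stable DNS name, which Kubernetes derives from the StatefulSet name, the pod ordinal, the governing headless Service, and the namespace. The sketch below illustrates that naming rule. The class and method names are hypothetical, and the cluster domain is passed in explicitly because "cluster.local" is only the common default.

```java
/** Hypothetical helper that builds the stable DNS name of a StatefulSet pod. */
final class StatefulSetDnsName {
  private StatefulSetDnsName() {
  }

  /**
   * Pod name is "<statefulSet>-<ordinal>"; its stable DNS name is
   * "<pod>.<headlessService>.<namespace>.svc.<clusterDomain>".
   */
  static String podDnsName(String statefulSet, int ordinal, String headlessService,
      String namespace, String clusterDomain) {
    if (ordinal < 0)
      throw new IllegalArgumentException("ordinal must be >= 0");
    return statefulSet + "-" + ordinal + "." + headlessService + "." + namespace
        + ".svc." + clusterDomain;
  }
}
```

For example, podDnsName("arcadedb", 0, "arcadedb", "default", "cluster.local") returns "arcadedb-0.arcadedb.default.svc.cluster.local", assuming a StatefulSet and headless Service both named arcadedb. Each node's certificate would carry the corresponding name as a DNS SAN.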
Test plan
- Unit test: RaftPropertiesBuilder builds a correct GrpcTlsConfig from the new settings and attaches it to Parameters.
- Integration test: 3-node BaseRaftHATest subclass that boots with mTLS enabled using a test CA under src/test/resources; verifies a full leader election + transaction commit + follower catch-up round-trip.
- Negative integration test: a 4th node with a cert signed by a different CA attempts to join; handshake is rejected and no log entries are accepted from it.
- Snapshot HTTP handler: confirm the existing X-ArcadeDB-Cluster-Token check on SnapshotHttpHandler is preserved (TLS on gRPC does not replace the HTTP token).
Out of scope for this issue
- Cert rotation without restart (can be a follow-up once the static-config path lands)
- Integrating with an external KMS or Vault agent for cert retrieval (leave it to ops; document the pattern)