Problem
The k8s/helm/ chart has several critical bugs that prevent an HA cluster from forming, ships insecure defaults, and has not been updated for the new ha-raft Apache Ratis subsystem.
Critical Bugs
1. Environment variable expansion broken (${HOSTNAME} not resolved)
statefulset.yaml passes JVM arguments using shell syntax (${HOSTNAME}, ${rootPassword}) inside a Kubernetes command: array. Kubernetes exec-form does not perform shell substitution, so the JVM receives the literal string ${HOSTNAME} as the server name. This breaks:
- HA peer identity resolution (RaftPeerAddressResolver.findLocalPeerId) - all nodes appear as ${HOSTNAME}
- Authentication - the password is set to the literal string ${rootPassword}
Expected: Use Kubernetes-native $(VAR_NAME) substitution and declare HOSTNAME explicitly via the downward API (fieldRef: metadata.name).
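A minimal sketch of the fix in statefulset.yaml, assuming the chart keeps its current exec-form command: array (the Secret name is an assumption; property names are illustrative):

```yaml
# Declare HOSTNAME via the downward API and use Kubernetes-native $(VAR)
# substitution, which works in exec-form command/args arrays.
env:
  - name: HOSTNAME
    valueFrom:
      fieldRef:
        fieldPath: metadata.name        # the pod name, e.g. arcadedb-0
  - name: ROOT_PASSWORD
    valueFrom:
      secretKeyRef:
        name: arcadedb-credentials      # assumed Secret name
        key: rootPassword
command:
  - java
  - "-Darcadedb.server.name=$(HOSTNAME)"
  - "-Darcadedb.server.rootPassword=$(ROOT_PASSWORD)"
```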
2. Wrong Raft port (2424 instead of 2434)
The old binary-protocol HA used port 2424. The new ha-raft module (Apache Ratis gRPC) uses port 2434 (GlobalConfiguration.HA_RAFT_PORT). The chart defaults to 2424, so peers can never connect.
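A values.yaml sketch under the chart's assumed key layout, aligning the default with GlobalConfiguration.HA_RAFT_PORT:

```yaml
service:
  rpc:
    port: 2434   # Apache Ratis gRPC (ha-raft), not the old binary-protocol 2424
```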
3. Invalid HA property -Darcadedb.ha.replicationIncomingHost
This property does not exist in the ha-raft subsystem. It is a leftover from the old binary-protocol HA and causes a startup warning/error.
4. HA unconditionally enabled for single-node deployments
HA JVM arguments are always emitted, even for replicaCount: 1 (dev/test deployments). HA should only be enabled when replicaCount > 1 or autoscaling.enabled: true.
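One way to gate the HA arguments in the template, a sketch assuming the existing replicaCount and autoscaling values:

```yaml
{{- if or (gt (int .Values.replicaCount) 1) .Values.autoscaling.enabled }}
# HA-related JVM arguments are only rendered for multi-node deployments
- "-Darcadedb.ha.enabled=true"
{{- end }}
```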
5. Bootstrap deadlock - headless service missing publishNotReadyAddresses: true
Without this field, Kubernetes only publishes DNS entries for pods that have passed their readiness probe. During HA bootstrap, pods need to resolve each other before any of them is ready, which causes a deadlock: no node can start because it cannot discover its peers.
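A headless-service sketch with the missing field; the name and labels are assumptions about the chart's conventions:

```yaml
apiVersion: v1
kind: Service
metadata:
  name: arcadedb-headless            # assumed name
spec:
  clusterIP: None
  publishNotReadyAddresses: true     # publish DNS for not-yet-ready pods during bootstrap
  selector:
    app.kubernetes.io/name: arcadedb
  ports:
    - name: rpc
      port: 2434
```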
6. Ingress backend points at headless service
ingress.yaml references the headless service as the ingress backend. Ingress controllers cannot route to headless services. The port key also references a non-existent $.Values.service.port instead of $.Values.service.http.port.
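The backend should target the regular ClusterIP HTTP service and the existing http port key; a sketch assuming an arcadedb.fullname helper exists:

```yaml
backend:
  service:
    name: {{ include "arcadedb.fullname" . }}   # ClusterIP HTTP service, not the headless one
    port:
      number: {{ .Values.service.http.port }}
```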
Additional Issues
HPA / KubernetesAutoJoin not supported
When autoscaling.enabled: true, the arcadedb.nodenames helper sizes the server list to replicaCount rather than autoscaling.maxReplicas. This means KubernetesAutoJoin cannot resolve pod ordinals beyond the initial replica count, and scale-up fails.
There is also no Raft quorum guard: nothing prevents autoscaling.minReplicas from being set below floor(maxReplicas/2) + 1, so a scale-down can drop the cluster below quorum and cost it availability.
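A _helpers.tpl sketch of both fixes, assuming the helper builds a comma-separated pod name list and an arcadedb.fullname helper exists (the value layout is an assumption):

```yaml
{{- define "arcadedb.nodenames" -}}
{{- /* Size the list by maxReplicas when HPA is enabled so every possible
       ordinal resolves, and guard the Raft quorum. */ -}}
{{- $count := int .Values.replicaCount -}}
{{- if .Values.autoscaling.enabled -}}
  {{- $count = int .Values.autoscaling.maxReplicas -}}
  {{- $quorum := add (div $count 2) 1 -}}
  {{- if lt (int .Values.autoscaling.minReplicas) (int $quorum) -}}
    {{- fail "autoscaling.minReplicas must be at least floor(maxReplicas/2) + 1 to preserve Raft quorum" -}}
  {{- end -}}
{{- end -}}
{{- range $i := until $count -}}
{{- if $i }},{{ end }}{{ printf "%s-%d" (include "arcadedb.fullname" $) $i -}}
{{- end -}}
{{- end -}}
```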
Insecure and incorrect defaults
| Setting | Current (wrong) | Expected |
|---|---|---|
| arcadedb.defaultDatabases | Universe[foo:bar] | "" - no hardcoded credentials |
| arcadedb.extraCommands | development mode | production mode |
| podSecurityContext | {} | runAsNonRoot: true, fsGroup: 1000 |
| securityContext | {} | runAsUser/Group: 1000, allowPrivilegeEscalation: false, capabilities.drop: [ALL] |
| serviceAccount.automount | true | false (ArcadeDB does not call the K8s API) |
| service.http.type | LoadBalancer | ClusterIP |
| service.rpc.port | 2424 | 2434 |
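A values.yaml sketch of the hardened defaults above; key names follow the table, the exact layout is an assumption:

```yaml
arcadedb:
  defaultDatabases: ""              # no hardcoded credentials
podSecurityContext:
  runAsNonRoot: true
  fsGroup: 1000
securityContext:
  runAsUser: 1000
  runAsGroup: 1000
  allowPrivilegeEscalation: false
  capabilities:
    drop: ["ALL"]
serviceAccount:
  automount: false                  # ArcadeDB does not call the K8s API
service:
  http:
    type: ClusterIP
```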
No persistence by default
There is no persistence section. The chart does not create a PersistentVolumeClaim for the database directory, so data is lost on pod restart.
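A sketch of an opt-in persistence block, rendered as a StatefulSet volumeClaimTemplate; key names and the mount path are assumptions:

```yaml
persistence:
  enabled: true
  storageClass: ""                     # use the cluster default
  accessModes: ["ReadWriteOnce"]
  size: 10Gi
  mountPath: /home/arcadedb/databases  # assumed data directory of the image
```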
No NetworkPolicy
The Raft gRPC port (2434) is exposed to any workload in the cluster. RaftHAServer.java includes a security note recommending that operators restrict access to this port via NetworkPolicy rules - the chart should provide an opt-in policy.
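An opt-in NetworkPolicy sketch restricting the Raft port to pods of the same release; the name and label selectors are assumptions:

```yaml
apiVersion: networking.k8s.io/v1
kind: NetworkPolicy
metadata:
  name: arcadedb-raft               # assumed name
spec:
  podSelector:
    matchLabels:
      app.kubernetes.io/name: arcadedb
  policyTypes:
    - Ingress
  ingress:
    - from:
        - podSelector:
            matchLabels:
              app.kubernetes.io/name: arcadedb
      ports:
        - protocol: TCP
          port: 2434                # Raft gRPC
```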
Stale README
README.md documents the old 2424 port, wrong service type default (LoadBalancer), wrong defaultDatabases example, and is missing the persistence, networkPolicy, podSecurityContext, and securityContext sections entirely.
Affected Files
k8s/helm/Chart.yaml, k8s/helm/values.yaml, k8s/helm/templates/statefulset.yaml, k8s/helm/templates/_helpers.tpl, k8s/helm/templates/service.yaml, k8s/helm/templates/ingress.yaml, k8s/helm/templates/hpa.yaml, k8s/helm/templates/NOTES.txt, k8s/helm/templates/extra-manifests.yaml, k8s/helm/README.md